

Ensuring Consistent LLM Outputs Using Structured Prompts

Dec 5th, 2024

 

picture of someone who opened ChatGPT tab on his laptop

 

Introduction

 

Large Language Models (LLMs) have revolutionized how we interact with technology, enabling a wide range of applications from content generation to data analysis. However, the effectiveness of these models heavily relies on the quality of the prompts provided.

Structured prompts are essential for guiding LLMs toward generating consistent, relevant, and accurate outputs.

This article dives into the strategies for crafting structured prompts, the significance of clarity and specificity, and the techniques that enhance the reliability of LLM responses.

 

Importance of Structured Prompts

 

Structured prompts are critical in optimizing LLM performance. They help define the context and expectations for the model, thereby reducing ambiguity and enhancing the relevance of the generated responses.

A well-structured prompt can significantly improve the model’s ability to understand the task at hand, leading to outputs that are not only accurate but also aligned with user expectations.

 

conversation between a user and OpenAI ChatGPT; the user asked “help me write a good prompt” and ChatGPT answered

 

The Role of Clarity and Specificity

 

One of the core principles of effective prompt engineering is clarity and specificity. Prompts should be articulated in a way that leaves little room for interpretation.

Vague prompts often lead to ambiguous results, hindering the model’s performance. For instance, instead of asking, “Tell me about climate change”, a more structured prompt would be, “Provide a summary of the causes and effects of climate change in three bullet points”.

This specificity guides the LLM toward a focused response, enhancing the overall quality of the output.

 

Contextual Information

Providing contextual information is another vital aspect of structured prompts. Context helps the model understand the nuances of the request, which is particularly important in complex tasks.

For example, if the user is seeking information about a historical event, including the time and geographical location can greatly assist the model in generating a more accurate response.

This approach not only improves the relevance of the output but also reduces the likelihood of the model generating irrelevant information.

 

Techniques for Crafting Structured Prompts

 

1. Use of Clear Separators:

Utilizing clear separators within prompts can enhance their structure.

Special characters, such as colons or bullet points, can help delineate instructions, examples, and expected outputs. For instance, a prompt structured as follows can clarify the task:

Task: Generate a list of three benefits of renewable energy:

1.
2.
3.

This format not only organizes the information but also sets clear expectations for the model, leading to more coherent outputs.
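The same separator idea applies when prompts are assembled in code. As a minimal sketch (the `build_prompt` helper below is illustrative, not part of any library):

```python
# Illustrative helper: assemble a prompt with explicit separators so the
# task and the expected output format are clearly delineated.
def build_prompt(task: str, num_items: int) -> str:
    lines = [f"Task: {task}"]
    # Numbered placeholders signal both the format and the expected length.
    lines += [f"{i}." for i in range(1, num_items + 1)]
    return "\n".join(lines)

prompt = build_prompt("Generate a list of three benefits of renewable energy", 3)
print(prompt)
```

Keeping the structure in a helper like this also makes it easy to reuse the same format across many requests.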

 

2. Task Decomposition:

Breaking down complex tasks into simpler subtasks can significantly improve clarity and performance.

Instead of presenting a monolithic prompt that encompasses multiple tasks, it is more effective to focus on one aspect at a time.

 

Principles of Task Decomposition

  1. Clarify the Goal: Identify the outcome you’re trying to achieve.
  2. Break Down the Goal: Decompose the overall task into distinct sub-tasks or steps.
  3. Sequence the Sub-tasks: Organize the sub-tasks in a logical order, ensuring that one step depends on the previous ones.
  4. Provide Context for Each Step: Give enough context for the model to address each sub-task appropriately.

For example, rather than asking, “Write a research paper about climate change,” a structured approach would involve organizing the process into manageable phases:

Good Prompt:

“Help me write a research paper on the effects of climate change on ocean life. Let’s break it down step by step.”

Step 1: “What are the key topics I should cover in this paper?”
Step 2: “Write an introduction that defines climate change and its impact on oceans.”
Step 3: “Create an outline of sections discussing specific marine life affected by climate change.”
Step 4: “Summarize the current scientific understanding of ocean temperature rise and its effects on marine life.”
Step 5: “Help me conclude the paper with potential solutions to mitigate these effects.”

This way, the LLM is guided to focus on specific parts of the process one by one, ensuring each section gets thorough attention.
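The five steps above can be driven programmatically, one sub-task per call. This is a sketch only: `ask_llm` is a stand-in for whatever chat-completion API you actually use.

```python
# Stand-in for a real LLM API call (hypothetical, for illustration only).
def ask_llm(prompt: str) -> str:
    return f"[model response to: {prompt}]"

context = ("Help me write a research paper on the effects of climate change "
           "on ocean life. Let's break it down step by step.")

steps = [
    "What are the key topics I should cover in this paper?",
    "Write an introduction that defines climate change and its impact on oceans.",
    "Create an outline of sections discussing specific marine life affected by climate change.",
    "Summarize the current scientific understanding of ocean temperature rise and its effects on marine life.",
    "Help me conclude the paper with potential solutions to mitigate these effects.",
]

# Each call carries the shared context plus exactly one sub-task,
# so the model focuses on one part of the process at a time.
responses = [ask_llm(f"{context}\nStep {i}: {step}")
             for i, step in enumerate(steps, 1)]
```

In a real pipeline you would typically feed earlier responses back into later steps as additional context.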

 

3. K-Shot Prompting:

K-shot prompting refers to providing K examples (few-shot, one-shot, or zero-shot) in the input to guide a language model on how to generate responses or solve tasks.

Here’s a breakdown with examples for each scenario:

 

Zero-Shot Prompting

You provide no examples, just instructions for the task.

Translate the following sentence into French:

“The cat is sleeping on the mat.”

 

 

One-Shot Prompting

You provide one example of the task before the input.

Translate the following sentences into French:

Example:

“She is reading a book.” -> “Elle lit un livre.”

Now translate:

“The cat is sleeping on the mat.”

 

 

Few-Shot Prompting

Few-shot prompting involves providing the LLM with a few examples of desired input-output pairs.

This technique helps guide the model toward generating higher-quality responses by demonstrating the expected pattern.

 

Example: Summarizing Text

In this example, you’re asking the model to summarize text. By providing a few examples of summaries, you guide it on how to condense information.

 

Let’s Take a Look:

“Here are a few examples of text summaries:

Example 1:

Text: The Eiffel Tower is one of the most famous landmarks in the world, located in Paris, France. It was constructed between 1887 and 1889 as part of the 1889 World’s Fair.

Summary: The Eiffel Tower, built between 1887 and 1889 for the World’s Fair in Paris, is a world-renowned landmark.

Example 2:

Text: Water is essential for all forms of life. It makes up about 60% of the human body and plays a key role in digestion, temperature regulation, and transportation of nutrients.

Summary: Water, which makes up about 60% of the human body, is crucial for digestion, temperature regulation, and nutrient transport.

Now, please summarize the following text:

Text: The Amazon rainforest, often referred to as the “lungs of the Earth,” produces around 20% of the world’s oxygen. It is home to millions of species and plays a significant role in global climate regulation.”

The model understands the task (summarizing a text) by seeing the format and how to condense the main points into a summary.
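When the example pairs live in code, the few-shot prompt above can be assembled with a small helper. This is a sketch under the assumption that examples are stored as (text, summary) tuples; `few_shot_prompt` is not a library function.

```python
# Illustrative helper: build a few-shot prompt from (text, summary) pairs.
def few_shot_prompt(examples, query):
    parts = ["Here are a few examples of text summaries:"]
    for i, (text, summary) in enumerate(examples, 1):
        parts.append(f"Example {i}:\nText: {text}\nSummary: {summary}")
    parts.append(f"Now, please summarize the following text:\nText: {query}")
    return "\n\n".join(parts)

examples = [
    ("The Eiffel Tower is one of the most famous landmarks in the world, "
     "located in Paris, France.",
     "The Eiffel Tower is a world-renowned landmark in Paris."),
    ("Water is essential for all forms of life.",
     "Water is crucial for all forms of life."),
]

prompt = few_shot_prompt(
    examples,
    "The Amazon rainforest produces around 20% of the world's oxygen.",
)
```

Each example follows the same Text/Summary pattern, which is what teaches the model the expected format.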

 

 

4. Chain-of-Thought Prompting:

Encouraging the model to “think step-by-step” by explicitly prompting it to break down complex tasks into intermediate reasoning steps enhances its ability to solve problems that require logical deduction.

Let’s say you want the model to solve a math problem. Instead of just asking for the answer, you can prompt the model to explain its reasoning.

 

Problem:

A bookstore sells a book for $25. If the price of the book is discounted by 20%, how much does the book cost after the discount?

Please solve this problem step by step:

Problem: A bookstore sells a book for $25. If the price of the book is discounted by 20%, how much does the book cost after the discount?

1. First, we need to calculate the amount of the discount.

2. To find the discount, multiply the original price by the discount percentage (20% or 0.20).

3. Subtract the discount from the original price to find the new price of the book.

4. Show the final cost after applying the discount.

 

 

Model Response (following Chain of Thought):

1. To calculate the discount, we multiply the original price by the discount percentage:

25 * 0.20 = 5.

So, the discount is $5.

2. Now, we subtract the discount from the original price:

25 – 5 = 20.

3. Therefore, the price of the book after the discount is $20.

This approach not only improves the accuracy of the response but also favors a more logical flow of information.
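The intermediate steps in the worked example above can be checked directly:

```python
# Verify the chain-of-thought arithmetic from the discount example.
original_price = 25
discount_rate = 0.20

discount = original_price * discount_rate   # step 1-2: 25 * 0.20 = 5.0
final_price = original_price - discount     # step 3:   25 - 5   = 20.0
print(final_price)  # 20.0
```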

 

5. Balancing Prompt Length:

While crafting structured prompts, it is crucial to find the optimal prompt length.

Research has shown that excessively long prompts can hinder accuracy, while overly concise prompts may fail to provide sufficient context.

The paper titled “Same Task, More Tokens: The Impact of Input Length on the Reasoning Performance of Large Language Models” reveals that LLMs experience a significant decline in their reasoning abilities when the input length reaches 3,000 tokens, which is considerably shorter than their technical maximum.

 

Graph showing the accuracy of different language models (GPT-3.5, GPT-4, Gemini Pro, Mistral 70B, and Mixtral 8x7B) in reasoning tasks as input length increases (250 to 3000 tokens). Solid lines represent normal processing, while dashed lines represent chain-of-thought (CoT) prompting. Accuracy generally declines with longer input lengths, with GPT-4 maintaining the highest performance across all lengths

 

Striking a balance between providing enough information and avoiding information overload is essential for maximizing LLM performance.

The paper suggests that the optimal prompt length varies depending on the task, dataset, and LLM architecture.
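A quick, very rough length check can help you stay below the degradation range reported in that paper. The 0.75-words-per-token figure below is a common heuristic, not an exact rule; real tokenizers vary by model.

```python
# Very rough token estimate (heuristic: ~0.75 words per token; real
# tokenizers differ, so treat this only as a sanity check, not a measure).
def approx_tokens(text: str) -> int:
    return round(len(text.split()) / 0.75)

prompt = ("Provide a summary of the causes and effects of climate change "
          "in three bullet points.")
assert approx_tokens(prompt) < 3000  # stay well below the degradation point
```

For precise counts, use the tokenizer that ships with your target model.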

 

Iterative Refinement of Prompts

Prompt engineering is often an iterative process. Initial prompts should be refined based on the LLM’s responses until the desired outcome is achieved.

This iterative refinement allows users to identify what works best for their specific use case and adjust their prompts accordingly. For example, if a prompt yields irrelevant responses, users can analyze the structure and content of the prompt to make necessary adjustments.

This process not only enhances the quality of outputs but also builds a deeper understanding of how the LLM interprets different types of prompts.

 

Fine-Tuning Using UbiAI

 

One of the best ways to ensure structured LLM outputs is to use fine-tuning.

Fine-tune and evaluate your model with UbiAI:

  • Prepare your high-quality training data
  • Train best-in-class LLMs: Build domain-specific models that truly understand your context; fine-tune effortlessly, no coding required
  • Deploy with just a few clicks: Go from a fine-tuned model to a live API endpoint with a single click
  • Optimize with confidence: Unlock instant, scalable ROI by monitoring and analyzing model performance to ensure peak accuracy and tailored outcomes

Fine-tuning LLMs refers to the process of adapting a pre-trained language model to a specific task or domain by training it on a specialized dataset.

It leverages the LLM’s general knowledge and capabilities while customizing its performance for a specific task.

Let’s see how we can achieve this using Huggingface and UbiAI:

Huggingface is the leading open-source platform for models and datasets, designed to centralize access to a vast collection of open-source models in one place. It simplifies the use of diverse models for various applications, including multimodal tasks, computer vision, natural language processing, and audio generation.

 

picture showing the interface of the “huggingface” platform that contains datasets and open-source models

 

Let’s access the Dataset section and search for a proper dataset.

Click on Tasks.

 

huggingface platform with a red arrow pointing to tasks sections which is in a red box

 


Then scroll down to Natural Language Processing and select Text Generation:

 


 

As you can see, we got thousands of datasets curated for fine-tuning a text generation model.

For this tutorial, we will pick this dataset: rajpurkar/squad_v2

This dataset can be adapted for tasks that require extracting specific fields from text. It also consists of questions posed on a set of Wikipedia articles, along with the corresponding answers, which can be formatted into JSON structures.

Since our objective is to guarantee structured output from an LLM, this dataset is a good fit: its question-answer pairs can be used to fine-tune the model to produce JSON output.

All we need to do now is head over to Files and versions and download the Parquet file; make sure to download the training files:

 

 

A common challenge developers run into here is file format: the files of this dataset are in Parquet format, which we need to convert to CSV.

It’s really easy with the popular Pandas library; three lines of code solve this issue:

 

import pandas as pd

df = pd.read_parquet('train-00000-of-00001.parquet')

df.to_csv('dataset.csv', index=False)

 

Just like that, we made the dataset ready to be used in fine-tuning!

 

To upload a dataset to UbiAI for fine-tuning purposes, you need four columns in the CSV dataset:

  • System Prompt: Predefined instruction or context provided to the AI to guide its behavior, tone, or response style throughout the interaction.
  • User Prompt: The input or query provided by the user during the interaction.
  • Input: This is a user input example.
  • Output: This is the expected output.

In order to upload and use the dataset, we need to ensure the columns that UbiAI requires exist in our dataset.

Currently, the only column we are missing is the System Prompt. Let’s see how we can add a column very easily in MS Excel:

First open the CSV file using Excel:

 

 

Don’t panic just yet! The file may look messy, but we just need to add a single column to each row.

 

Steps to Add a Column to All Rows

  1. Open the CSV File in Excel:
  • Open the file as you normally would in Excel. If the content appears in one column (not split into cells), use the Text to Columns tool:
  • Select the single column. In this case select column A
  • Go to Data > Text to Columns.
  • Choose Delimited, select Comma as the delimiter, and click Finish.
 

 

Perfect!

Now just select the header cell after “answer” and write “System Prompt”.

 

 

The next step is to fill all cells of that column with your desired System Prompt. For this tutorial, we want to ensure structured JSON output, so this is the System Prompt that we will go with:

 

“You are a helpful assistant designed to provide responses strictly in JSON format. Every answer you give must adhere to the following rules:

1. Always output a valid JSON object or array.

2. Do not include any text or explanation outside the JSON structure.

3. Use keys that are concise yet descriptive.

4. If asked to provide a structured answer, ensure all elements are formatted as JSON.

Example:

For a question like “What are the top programming languages?”, your response should look like:

{

"languages": ["Python", "JavaScript", "Java"]

}”

 

For more complex queries, provide nested structures where appropriate. Ensure the JSON is always properly formatted and free of syntax errors.

 

Finally, just fill the cells with that System Prompt:

 

Fill the Column with the Same Text

  • Type the Text in the First Cell of the Column:
  • Click on the first cell of the column where you want the text to appear and type the desired text.
  • Select the Entire Column:
  • Click on the column header (e.g., A, B, etc.) to select the entire column.
  • Fill the Column with Text:
  • Press Ctrl + D (Windows) or Cmd + D (Mac) to fill the text into all rows of the selected column.
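If you prefer to stay in code, the same column can be added with pandas instead of Excel. The tiny DataFrame below is a stand-in for reading the real `dataset.csv`; the column names are illustrative.

```python
import pandas as pd

SYSTEM_PROMPT = ("You are a helpful assistant designed to provide responses "
                 "strictly in JSON format.")

# Stand-in for pd.read_csv('dataset.csv') in the real workflow.
df = pd.DataFrame({
    "question": ["What are the top programming languages?"],
    "answer":   ['{"languages": ["Python", "JavaScript", "Java"]}'],
})

# Assigning a scalar broadcasts the same prompt to every row,
# which is exactly what the Ctrl + D fill does in Excel.
df["System Prompt"] = SYSTEM_PROMPT

df.to_csv("dataset.csv", index=False)
```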

The final file should look like this:

 

 

Let’s head over to UbiAI and finish model fine-tuning.

First, let’s create a Dataset:

 

UbiAI platform in the dataset section, red arrow pointing to “New Dataset”

 

The next step is to fill out the dataset details.

  • Dataset Name: Pick a name for your dataset.
  • Dataset Type: Pick “Text Generation”; this sets up the dataset for fine-tuning.
  • Select Language: This is where you pick the language of the dataset.
  • Description: A short description of the dataset.
 

screenshot showing the details to be filled in the UbiAI platform in the dataset section

 

After clicking Next, you are going to fill out the appropriate columns from the dataset in this section.

As we discussed before, you have to fill out the User Prompt, System Prompt, Output (Response), and Input.

User Prompt

 

System Prompt

 

Response

 

Input

 

This step is crucial for mapping the data and making the training easier.

 

After clicking “Finish”, you have to validate each row of the dataset either manually or automatically.

To validate the rows automatically, you can press Select All and then Validate.

The validated rows will go into training the model later.

 

screenshot of the UbiAI platform in the dataset section showing how files are validated

 

Finally, after validating the whole dataset, what’s left is to train an LLM.

Head over to the Models section and click on “New Model”:

 

screenshot showing how to create a new model in UbiAI platform

 

Next, fill out the Model Details, just like in the Dataset section.

One thing to remember is that you must select “Text Generation” in the Model Category option. This will allow us to pick an LLM to fine-tune:

 

screenshot showing the model details needed to be filled

 

Click “Next”.

Now assign the dataset that we uploaded and validated.

Click “Finish”.

You will see a “Model created successfully” message. Congrats on creating your first LLM on UbiAI!

Now let’s head over to the model’s details and adjust some parameters before training:

 

screenshot showing model card on UbiAI platform with red arrow pointing to more details of the model created

 

Here you can view the model’s characteristics:

  • Dashboard: Here you can view the overall status of the model
  • Training History: This is where you will find all the previous training efforts on the model.
  • Dataset: This is where you would find the dataset the model is trained on
  • Playground: In Playground, you can chat with the model and try it out after training.
  • Training Evaluation: In this section, you can view the evaluation of the model after training; you can look into the different metrics used to measure the model’s performance: F1 Score, Precision, and Recall.
  • Confusion Matrix: A confusion matrix is a table used to evaluate the performance of a classification algorithm, particularly in supervised learning. It compares the actual values (ground truth) with the predicted values (model output). The matrix shows the number of correct and incorrect predictions, broken down by class. Currently, Confusion Matrix is not supported in the LLMs world but you can use it in Supervised Learning.

Now the moment you’ve been waiting for! It’s time for model training!

Head over to the Dashboard and choose the base model; in this tutorial we are going to pick llama-3.1-8b-instruct.

 

screenshot showing the dashboard part of the model in the UbiAI platform

 

Click on “Start Model Training”.

screenshot showing that the model is training in the UbiAI platform

 

And Voilà! Your model is now being fine-tuned on your dataset! Yes, it’s that easy!

 

The model has finished training and is now ready to be used!

Let’s try it out!

 

 

 

Conclusion

 

In conclusion, achieving consistent outputs from LLMs through structured prompts is a complex process that calls for a focus on clarity, precision, and context.

 

Using methods like clear separators, breaking tasks into smaller steps, few-shot prompting, and chain-of-thought prompting can greatly improve the quality of LLM responses.

Additionally, finding the right balance in prompt length and refining prompts through repeated adjustments are key steps to maximizing performance.

 

As prompt engineering evolves, staying flexible and fine-tuning strategies will remain essential for making the most of these advanced models.

 

Fine-tuning is also one of the best methods to ensure structured model output. Leveraging platforms designed to streamline the fine-tuning process, such as UbiAI, can significantly enhance model performance.

These platforms offer user-friendly interfaces and powerful tools, making it easier to refine and optimize models. The ability to fine-tune models with precision ensures that the generated outputs are more aligned with specific use cases, enhancing their relevance and reliability. Fine-tuning will continue to be a crucial step for anyone looking to achieve high-quality, structured model outputs.
