
A Comprehensive Hands-On Guide¶
In this blog, we’ll explore how to build AI agents that learn and improve from their own experiences using the UBIAI framework. Unlike traditional AI agents that remain static after deployment, UBIAI enables continuous learning through reinforcement learning techniques.
What You’ll Learn¶
- Understanding the core concepts
- Setting up a complete training environment
- Building a SQL-generation agent using LangGraph
- Implementing reinforcement learning workflows
- Debugging and optimizing agent performance
Introduction to UBIAI RLFT¶
The Challenge with Traditional AI Agents¶
Modern AI agents powered by large language models can perform impressive tasks – from answering questions to executing complex workflows. However, they suffer from a critical limitation: they don’t learn from their mistakes.
Traditional frameworks like LangChain, AutoGen, and CrewAI provide excellent tools for building agents, but once deployed, these agents operate with fixed behaviors. When they encounter errors or suboptimal outcomes, they can’t automatically adjust their strategy.
UBIAI’s Solution¶

UBIAI introduces a learning layer on top of existing agent frameworks. The key innovation is agent disaggregation – separating the execution logic from the learning mechanism. This allows:
- API integration with existing agents (LangChain, LangGraph, AutoGen, CrewAI, custom Python)
- Automatic learning from success and failure signals
- Progressive improvement through reinforcement learning cycles
- Flexible reward systems for intermediate and final outcomes
Understanding UBIAI’s Approach to AI Training¶
What Makes UBIAI Different?¶
Traditional AI platforms require you to manage everything: data preprocessing, model selection, training infrastructure, and deployment. UBIAI simplifies this entire pipeline into a unified platform specifically designed for natural language processing and annotation tasks.
How Reinforcement Learning Works in UBIAI¶
UBIAI uses KTO (Kahneman-Tversky Optimization), a reinforcement learning algorithm that learns from simple binary human feedback (positive/negative labels) rather than paired preference comparisons. Here’s how it works:
1. Your model generates an SQL query
2. The query is executed and validated
3. Feedback is collected: ✓ (correct) or ✗ (incorrect)
4. This feedback is stored as training data
5. Periodically, the model retrains on accumulated feedback
6. The model learns to generate better queries over time
This creates a virtuous cycle where your model becomes more accurate with every retraining cycle, as the short sketch below illustrates.
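To make the loop concrete, here is a minimal, self-contained sketch of that cycle. The two stub functions are placeholders for illustration only; the real UBIAI-backed implementations come later in this guide:
# Minimal sketch of the feedback cycle (illustrative stubs, not the UBIAI API)
def generate_sql(question: str) -> str:
    # Stand-in for the model: later we call UBIAI's inference endpoint here
    return "SELECT name FROM students;"

def collect_feedback(query: str) -> bool:
    # Stand-in for execution + user review: True = correct, False = incorrect
    return query.strip().upper().startswith("SELECT")

feedback_buffer = []
for question in ["What are the names of all students?"]:
    query = generate_sql(question)                     # 1. model generates a query
    is_correct = collect_feedback(query)               # 2-3. execute, validate, gather feedback
    feedback_buffer.append({                           # 4. store feedback as training data
        "prompt": question,
        "response": query,
        "label": "positive" if is_correct else "negative",
    })

if len(feedback_buffer) >= 50:                         # 5. once enough feedback accumulates...
    print("Time to retrain with KTO")                  # 6. ...kick off a KTO training run on UBIAI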
The UBIAI Data Format¶
UBIAI uses a simple CSV format for training data with three columns:
- System Prompt: Instructions that set the model’s behavior
- User Prompt: The specific input (e.g., “What are the names of all students?”)
- Response: The expected output (the SQL query)
For reinforcement learning, you also label each example as positive (correct) or negative (incorrect). The KTO algorithm uses these labels to understand what good outputs look like.
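As a concrete illustration, here is how a single training row could be written to that CSV with pandas. The column names are the ones UBIAI expects; the row content is made up for this example:
import pandas as pd

# One illustrative training example in UBIAI's three-column format (content is made up)
example_rows = [{
    "System Prompt": "You are an expert SQL developer. Return ONLY the SQL query.",
    "User Prompt": "Question: What are the names of all students?",
    "Response": "SELECT name FROM students;",
}]

# For reinforcement learning, this row would additionally be labeled positive or negative in UBIAI
pd.DataFrame(example_rows).to_csv("ubiai_training_example.csv", index=False)
print("Wrote ubiai_training_example.csv")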
Now let’s dive into building our system!
Part 1: Environment Setup¶
First, we’ll set up our Python environment with all the necessary libraries. We’re keeping dependencies minimal—just what we need for API communication, data handling, and visualization.
# Install required dependencies
!pip install -q requests pandas matplotlib seaborn sqlalchemy
# Import essential libraries
import requests
import json
import pandas as pd
import sqlite3
import os
import time
from typing import Dict, List, Any, Tuple, Optional
from datetime import datetime
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
# Configure visualization settings
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)
print("✓ All dependencies loaded successfully!")
Configuring Your UBIAI Connection¶
Next, we need to connect to your UBIAI account. You’ll need three pieces of information:
- UBIAI Token: Your API authentication token (format: /your-token-id)
- Project ID: The ID of your UBIAI project containing the training data
- Base Model: The foundation model you’re fine-tuning (e.g., llama-3-1-8b-instruct)
Where to find these:
- Token: UBIAI Dashboard → Settings → API Keys
- Project ID: UBIAI Dashboard → Your Project → Project Settings
- Base Model: The model you selected when creating your project
Replace the placeholder values below with your actual credentials:
# UBIAI Configuration
# ⚠️ IMPORTANT: Replace these with your actual credentials
UBIAI_TOKEN = "" # Your UBIAI API token
UBIAI_PROJECT_ID = "" # Your project ID
# API Endpoints (no need to change these)
UBIAI_BASE_URL = "https://api.ubiai.tools:8443"
UBIAI_ANNOTATE_URL = f"{UBIAI_BASE_URL}/api_v1/annotate" # For inference
UBIAI_TRAIN_URL = f"{UBIAI_BASE_URL}/api_v1/train_model" # For training
# Model Settings
BASE_MODEL = "llama-3-1-8b-instruct" # Your fine-tuned model base
TEMPERATURE = 0.7 # Controls randomness (0.0 = deterministic, 1.0 = creative)
print("UBIAI Configuration Summary:")
print("="*50)
print(f" API Base URL: {UBIAI_BASE_URL}")
print(f" Project ID: {UBIAI_PROJECT_ID}")
print(f" Base Model: {BASE_MODEL}")
print(f" Temperature: {TEMPERATURE}")
print("="*50)
print("\n✓ Configuration complete!")
Part 2: Building the UBIAI API Client¶
Now we’ll create a Python client to communicate with UBIAI’s APIs. This client will handle two main operations:
- SQL Generation (Inference): Send a natural language question and receive a SQL query (this is a model we previously fine-tuned on UBIAI that already specializes in generating SQL queries)
- Model Training: Initiate reinforcement learning training with new feedback data
The client abstracts away the HTTP request details, giving you simple Python methods to work with.
class UBIAISQLClient:
"""
Client for interacting with UBIAI's SQL generation and training APIs.
This class provides two main capabilities:
1. Generate SQL queries from natural language (inference)
2. Initiate model training with reinforcement learning
"""
def __init__(self, token: str, project_id: str, base_model: str = BASE_MODEL):
"""
Initialize the UBIAI client.
Args:
token: Your UBIAI API authentication token
project_id: Your UBIAI project ID
base_model: The base model name (e.g., llama-3-1-8b-instruct)
"""
self.token = token
self.project_id = project_id
self.base_model = base_model
self.annotate_url = UBIAI_ANNOTATE_URL
self.train_url = UBIAI_TRAIN_URL
def generate_sql(
self,
question: str,
schema: Optional[str] = None,
temperature: float = TEMPERATURE,
context: Optional[str] = None
) -> Dict[str, Any]:
"""
Generate SQL query from a natural language question.
This method calls UBIAI's inference API to generate a SQL query.
You can provide additional context like database schema to improve accuracy.
Args:
question: Natural language question (e.g., "What are all student names?")
schema: Optional database schema to provide context
temperature: Sampling temperature (0.0-1.0, higher = more creative)
context: Any additional context to help the model
Returns:
Dictionary containing:
- success: Whether generation succeeded
- query: The generated SQL query (if successful)
- error: Error message (if failed)
"""
# Build the system prompt - this sets the model's behavior
system_prompt = (
"You are an expert SQL developer. Generate precise, executable SQL queries "
"from natural language questions. Return ONLY the SQL query without explanations "
"or markdown formatting."
)
# Build the user prompt with all provided context
user_prompt = f"Question: {question}"
if schema:
user_prompt += f"\n\nDatabase Schema:\n{schema}"
if context:
user_prompt += f"\n\nAdditional Context:\n{context}"
user_prompt += "\n\nGenerate the SQL query:"
# Prepare the API request payload
data = {
"input_text": question,
"system_prompt": system_prompt,
"user_prompt": user_prompt,
"temperature": temperature,
"monitor_model": True, # Enable usage monitoring
"knowledge_base_ids": [],
"images_urls": []
}
try:
# Make the API request
response = requests.post(
self.annotate_url + self.token,
json=data,
timeout=30 # 30 second timeout
)
response.raise_for_status()
result = response.json()
# Extract and clean the generated SQL
if "output" in result:
sql = result["output"].strip()
# Remove markdown code blocks if present
sql = sql.replace("```sql", "").replace("```", "").strip()
return {
"success": True,
"query": sql,
"raw_response": result
}
else:
return {
"success": False,
"error": "No output in response",
"raw_response": result
}
except requests.exceptions.RequestException as e:
return {
"success": False,
"error": str(e),
"query": None
}
def initiate_training(
self,
training_type: str = "reinforcement-kto",
num_iterations: int = 20,
batch_size: int = 4,
dropout: float = 0.35,
validation_split: int = 20
) -> Dict[str, Any]:
"""
Initiate model training using UBIAI's training API.
This method starts a training job on UBIAI's servers. The training
will use the data already uploaded to your UBIAI project.
Args:
training_type: 'normale' (standard) or 'reinforcement-kto' (RL with KTO)
num_iterations: Number of training iterations/epochs
batch_size: Number of examples per training batch
dropout: Dropout rate for regularization (0.0-1.0)
validation_split: Percentage of data to use for validation
Returns:
Dictionary containing:
- success: Whether training was initiated
- response: Training job details from UBIAI
- error: Error message (if failed)
"""
# Prepare training configuration
data = {
"drop": dropout,
"max_batch": batch_size,
"nb_iter": str(num_iterations),
"project": self.project_id,
"selected_model": self.base_model,
"selected_validation": str(validation_split),
"model_type": "LLM",
"with_annotate": False,
"training_type": training_type,
}
try:
# Initiate training via API
response = requests.post(
self.train_url + self.token,
data=data,
timeout=60
)
response.raise_for_status()
result = response.json()
return {
"success": True,
"response": result,
"training_config": data
}
except requests.exceptions.RequestException as e:
return {
"success": False,
"error": str(e)
}
# Initialize the UBIAI client
ubiai_client = UBIAISQLClient(
token=UBIAI_TOKEN,
project_id=UBIAI_PROJECT_ID,
base_model=BASE_MODEL
)
print("✓ UBIAI Client initialized successfully!")
print("\nYou can now:")
print(" 1. Generate SQL queries with: ubiai_client.generate_sql()")
print(" 2. Initiate training with: ubiai_client.initiate_training()")
Testing Your Model: First SQL Generation¶
Before building the full system, let’s verify that your UBIAI model is working correctly. We’ll send a simple question and see what SQL query it generates.
What to expect:
- If successful, you’ll see a clean SQL query
- If there’s an error, check your API token and project ID
- The quality of the output depends on your model’s training
Let’s try it:
# Test with a simple question
test_question = "What are the names of all students?"
# Provide database schema for context
test_schema = "CREATE TABLE students (id INT, name VARCHAR(100), age INT, gpa FLOAT);"
print(f"Question: {test_question}")
print(f"Schema: {test_schema}")
print("\nGenerating SQL query...\n")
result = ubiai_client.generate_sql(
question=test_question,
schema=test_schema,
temperature=0.7
)
print("="*70)
if result["success"]:
print("✓ Query generated successfully!")
print(f"\nGenerated SQL:\n{result['query']}")
else:
print("✗ Generation failed")
print(f"Error: {result['error']}")
print("\nTroubleshooting:")
print(" - Verify your UBIAI token is correct")
print(" - Check that your project ID is valid")
print(" - Ensure your model is deployed in UBIAI")
print("="*70)
Part 3: Building a Production-Ready SQL Agent¶
Now we’re getting to the exciting part: building a production-ready agent that can safely generate and execute SQL queries. In production environments, safety is paramount. You don’t want a bug or malicious input causing your agent to drop tables or delete critical data.
Safety Features We’re Implementing:¶
- Query Validation: Inspect every generated query before execution
- Forbidden Keywords: Block destructive operations (DROP, DELETE, UPDATE, etc.)
- Read-Only Mode: Ensure queries can only read data, not modify it
- SQL Injection Prevention: Detect and block suspicious patterns
- Multiple Statement Protection: Prevent query chaining attacks
- Comprehensive Logging: Track every query for auditing and debugging
How It Works:¶
User Question → Generate SQL → Validate Query → Execute (optional) → Log Results
(dangerous queries are blocked at the validation step and never reach execution)
Let’s build it:
class ProductionSQLAgent:
"""
Production-ready SQL agent with comprehensive safety features.
This agent acts as a protective layer between user questions and your database.
It generates SQL queries using UBIAI, validates them for safety, and optionally
executes them with proper error handling.
"""
# List of SQL keywords that should be blocked in read-only mode
FORBIDDEN_KEYWORDS = [
"DROP", # Drop tables/databases
"DELETE", # Delete data
"UPDATE", # Modify data
"INSERT", # Insert new data
"ALTER", # Alter table structure
"CREATE", # Create new tables
"TRUNCATE", # Remove all rows
"GRANT", # Grant permissions
"REVOKE", # Revoke permissions
"REPLACE" # Replace data
]
def __init__(
self,
ubiai_client: UBIAISQLClient,
read_only: bool = True,
max_retries: int = 2,
enable_logging: bool = True
):
"""
Initialize the production SQL agent.
Args:
ubiai_client: Initialized UBIAI client for SQL generation
read_only: If True, block all data modification operations
max_retries: Number of retry attempts if generation fails
enable_logging: Whether to log all queries and results
"""
self.ubiai_client = ubiai_client
self.read_only = read_only
self.max_retries = max_retries
self.enable_logging = enable_logging
self.query_log = [] # Store all query executions
# Initialize statistics tracking
self.stats = {
"total_queries": 0,
"successful_queries": 0,
"failed_queries": 0,
"blocked_queries": 0
}
def validate_query(self, query: str) -> Tuple[bool, Optional[str]]:
"""
Validate SQL query for safety before execution.
This method performs multiple security checks on the generated query
to prevent malicious or destructive operations.
Args:
query: The SQL query to validate
Returns:
Tuple of (is_valid: bool, error_message: Optional[str])
"""
# Check if query is valid string
if not query or not isinstance(query, str):
return False, "Invalid query format"
query_upper = query.upper()
# Check for forbidden operations (only in read-only mode)
if self.read_only:
for keyword in self.FORBIDDEN_KEYWORDS:
if keyword in query_upper:
return False, f"Forbidden operation detected: {keyword} (read-only mode)"
# Check for multiple statements (SQL injection risk)
semicolon_count = query.count(";")
if semicolon_count > 1:
return False, "Multiple statements not allowed (security risk)"
# Check for SQL injection patterns
suspicious_patterns = [
"--", # SQL comments
"/*", # Block comments
"*/", # Block comments
"xp_", # SQL Server extended procedures
"sp_" # SQL Server stored procedures
]
for pattern in suspicious_patterns:
if pattern in query:
return False, f"Suspicious pattern detected: {pattern} (potential SQL injection)"
# All checks passed
return True, None
def execute_query(
self,
db_path: str,
query: str,
timeout: int = 10
) -> Tuple[Any, Optional[str]]:
"""
Execute SQL query against a SQLite database with timeout protection.
Args:
db_path: Path to the SQLite database file
query: SQL query to execute
timeout: Maximum execution time in seconds
Returns:
Tuple of (results: Any, error_message: Optional[str])
"""
try:
# Connect to database with timeout
conn = sqlite3.connect(db_path, timeout=timeout)
cursor = conn.cursor()
# Execute query
cursor.execute(query)
results = cursor.fetchall()
# Clean up
conn.close()
return results, None
except sqlite3.OperationalError as e:
return None, f"Database operation failed: {str(e)}"
except Exception as e:
return None, f"Execution error: {str(e)}"
def generate_and_validate(
self,
question: str,
db_path: Optional[str] = None,
schema: Optional[str] = None,
execute: bool = False,
verbose: bool = True
) -> Dict[str, Any]:
"""
Complete pipeline: Generate SQL, validate, and optionally execute.
This is the main method you'll use in production. It orchestrates
the entire workflow from question to result.
Args:
question: Natural language question
db_path: Database path (required if execute=True)
schema: Optional database schema for context
execute: Whether to execute the query (default: False for safety)
verbose: Print detailed progress information
Returns:
Dictionary containing:
- timestamp: When the query was processed
- question: Original question
- query: Generated SQL
- validation_passed: Whether validation succeeded
- execution_successful: Whether execution succeeded (if executed)
- results: Query results (if executed)
- error: Error message (if any)
- duration_ms: Processing time in milliseconds
"""
start_time = datetime.now()
if verbose:
print(f"\n{'='*70}")
print(f"Question: {question}")
print("-" * 70)
# Initialize result dictionary
result = {
"timestamp": start_time.isoformat(),
"question": question,
"query": None,
"validation_passed": False,
"execution_successful": False,
"results": None,
"error": None
}
# Step 1: Generate SQL query
if verbose:
print("[1/3] Generating SQL query...")
gen_result = self.ubiai_client.generate_sql(
question=question,
schema=schema
)
if not gen_result["success"]:
result["error"] = f"Generation failed: {gen_result['error']}"
self.stats["failed_queries"] += 1
if verbose:
print(f" ✗ {result['error']}")
self._log_query(result)
return result
query = gen_result["query"]
result["query"] = query
if verbose:
print(f" ✓ Generated: {query}")
# Step 2: Validate query for safety
if verbose:
print("[2/3] Validating query...")
is_valid, validation_error = self.validate_query(query)
if not is_valid:
result["error"] = f"Validation failed: {validation_error}"
self.stats["blocked_queries"] += 1
if verbose:
print(f" ✗ {validation_error}")
self._log_query(result)
return result
result["validation_passed"] = True
if verbose:
print(" ✓ Validation passed")
# Step 3: Execute query (if requested)
if execute:
if verbose:
print("[3/3] Executing query...")
if not db_path:
result["error"] = "Database path required for execution"
self.stats["failed_queries"] += 1
if verbose:
print(f" ✗ {result['error']}")
else:
exec_results, exec_error = self.execute_query(db_path, query)
if exec_error:
result["error"] = f"Execution failed: {exec_error}"
self.stats["failed_queries"] += 1
if verbose:
print(f" ✗ {exec_error}")
else:
result["execution_successful"] = True
result["results"] = exec_results
self.stats["successful_queries"] += 1
if verbose:
result_preview = exec_results[:3] if len(exec_results) > 3 else exec_results
print(f" ✓ Execution successful")
print(f" Results ({len(exec_results)} rows): {result_preview}")
if len(exec_results) > 3:
print(f" ... and {len(exec_results) - 3} more rows")
else:
self.stats["successful_queries"] += 1
if verbose:
print("[3/3] Skipping execution (execute=False)")
self.stats["total_queries"] += 1
# Calculate processing duration
end_time = datetime.now()
result["duration_ms"] = (end_time - start_time).total_seconds() * 1000
if verbose:
print(f"\nCompleted in {result['duration_ms']:.0f}ms")
print("="*70)
self._log_query(result)
return result
def _log_query(self, result: Dict[str, Any]):
"""Internal method to log query execution."""
if self.enable_logging:
self.query_log.append(result)
def get_statistics(self) -> Dict[str, Any]:
"""Get agent performance statistics."""
total = self.stats["total_queries"]
success_rate = self.stats["successful_queries"] / total if total > 0 else 0
return {
**self.stats,
"success_rate": success_rate
}
def export_logs(self, filepath: str):
"""Export query logs to CSV file for analysis."""
if self.query_log:
df = pd.DataFrame(self.query_log)
df.to_csv(filepath, index=False)
print(f"✓ Logs exported to: {filepath}")
print(f" Total entries: {len(df)}")
else:
print("No logs to export")
# Initialize the production agent
agent = ProductionSQLAgent(
ubiai_client=ubiai_client,
read_only=True, # Enable read-only mode for safety
max_retries=2, # Retry failed requests up to 2 times
enable_logging=True # Log all queries
)
print("✓ Production SQL Agent initialized!")
print("\nSafety Features Enabled:")
print(f" • Read-only mode: {agent.read_only}")
print(f" • Query validation: Enabled")
print(f" • Logging: {agent.enable_logging}")
print(f" • Blocked keywords: {len(agent.FORBIDDEN_KEYWORDS)}")
Testing the Production Agent¶
Let’s put our production agent through its paces with several test questions. This will demonstrate:
- How the agent handles valid queries
- How safety features block dangerous operations
- The logging and statistics tracking in action
We’ll test with increasingly complex questions to see how the model performs:
# Define a comprehensive set of test questions
test_questions = [
"Show all student names and their GPAs",
"What is the average age of students?",
"List students with GPA greater than 3.5",
"Count how many students are there",
"Find the oldest student",
"Show the top 5 students by GPA",
"DELETE FROM students WHERE gpa < 2.0" # This should be BLOCKED
]
# Database schema for context
example_schema = """
CREATE TABLE students (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
age INTEGER,
gpa FLOAT CHECK (gpa >= 0.0 AND gpa <= 4.0)
);
"""
print("Testing Production Agent")
print(f"Total test questions: {len(test_questions)}")
print(f"Schema provided: Yes")
print(f"Execution mode: Validation only (no database execution)\n")
# Run all test questions
for i, question in enumerate(test_questions, 1):
print(f"\n{'='*70}")
print(f"Test {i}/{len(test_questions)}")
print(f"{'='*70}")
result = agent.generate_and_validate(
question=question,
schema=example_schema,
execute=False, # Just validate, don't execute
verbose=True
)
# Small delay to respect rate limits
time.sleep(0.5)
# Display final statistics
print("\n\n" + "="*70)
print("AGENT PERFORMANCE STATISTICS")
print("="*70)
stats = agent.get_statistics()
print(f"Total Queries: {stats['total_queries']}")
print(f"Successful: {stats['successful_queries']}")
print(f"Failed: {stats['failed_queries']}")
print(f"Blocked (Security): {stats['blocked_queries']}")
print(f"Success Rate: {stats['success_rate']:.2%}")
print("="*70)
Part 4: Feedback Collection and the Self-Improving System¶

This is where the magic happens. To enable continuous improvement through reinforcement learning, we need to collect feedback on every query the agent generates. This feedback becomes the training data for the next iteration of your model.
Understanding the Feedback Loop:¶
In production, here’s how feedback collection works:
- User asks a question → Agent generates SQL
- Query executes → Results are returned
- User provides feedback → “This is correct” or “This is wrong”
- System stores feedback → Question + Query + Label (positive/negative)
- Threshold reached → Export feedback and retrain model
UBIAI’s Required Format:¶
UBIAI expects training data as a CSV with these columns:
- System Prompt: Instructions for the model
- User Prompt: The input (question + schema)
- Response: The expected output (SQL query)
For reinforcement learning, each row also gets a label: positive (correct) or negative (incorrect).
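To see what the labels mean in practice, here are two illustrative rows for the same question (system prompt omitted for brevity; the queries are made up). KTO treats the positive row as an output to reinforce and the negative row as one to avoid:
# Illustrative labeled examples for KTO (made-up content)
labeled_examples = [
    {
        "User Prompt": "Question: Find students with GPA above 3.5",
        "Response": "SELECT * FROM students WHERE gpa > 3.5;",
        "label": "positive",   # correct query: reinforce this behavior
    },
    {
        "User Prompt": "Question: Find students with GPA above 3.5",
        "Response": "SELECT * FROM students WHERE age > 3.5;",
        "label": "negative",   # wrong column used: discourage this behavior
    },
]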
Our FeedbackCollector class automates this entire process:
class FeedbackCollector:
"""
Collect and manage user feedback for continuous model improvement.
This class handles the entire feedback lifecycle:
1. Collect feedback from production usage
2. Format feedback in UBIAI's required CSV format
3. Monitor feedback accumulation
4. Trigger retraining when threshold is reached
5. Export data ready for UBIAI upload
"""
def __init__(self, retrain_threshold: int = 50):
"""
Initialize feedback collector.
Args:
retrain_threshold: Number of feedback examples before suggesting retrain
"""
self.feedback_data = []
self.retrain_threshold = retrain_threshold
self.last_export_time = datetime.now()
def add_feedback(
self,
question: str,
generated_query: str,
is_correct: bool,
schema: Optional[str] = None,
user_comment: Optional[str] = None,
correct_query: Optional[str] = None
):
"""
Add a feedback entry from production usage.
Args:
question: The user's natural language question
generated_query: SQL query generated by the model
is_correct: Whether the query was correct (True/False)
schema: Database schema that was provided
user_comment: Optional comment from user
correct_query: Optional correct query (if user provided one)
"""
# Build system prompt (consistent with training)
system_prompt = (
"You are an expert SQL developer. Generate precise, executable SQL queries "
"from natural language questions. Return ONLY the SQL query without explanations "
"or markdown formatting."
)
# Build user prompt
user_prompt = f"Question: {question}"
if schema:
user_prompt += f"\n\nDatabase Schema:\n{schema}"
user_prompt += "\n\nGenerate the SQL query:"
# Determine which query to use as the response
response_query = correct_query if correct_query else generated_query
# Create feedback entry in UBIAI format
feedback = {
"timestamp": datetime.now().isoformat(),
"System Prompt": system_prompt, # UBIAI column 1
"User Prompt": user_prompt, # UBIAI column 2
"Response": response_query, # UBIAI column 3
"is_correct": is_correct,
"label": "positive" if is_correct else "negative",
"user_comment": user_comment,
"original_question": question,
"generated_query": generated_query
}
self.feedback_data.append(feedback)
# Check if we've reached the retraining threshold
if len(self.feedback_data) >= self.retrain_threshold:
print(f"\n⚡ RETRAIN THRESHOLD REACHED!")
print(f" Collected: {len(self.feedback_data)} feedback examples")
print(f" Threshold: {self.retrain_threshold}")
print(f"\n Recommendation: Export feedback and initiate retraining")
def get_statistics(self) -> Dict[str, Any]:
"""Get current feedback statistics."""
if not self.feedback_data:
return {"message": "No feedback collected yet"}
df = pd.DataFrame(self.feedback_data)
positive_count = (df["is_correct"] == True).sum()
negative_count = (df["is_correct"] == False).sum()
accuracy = positive_count / len(df) if len(df) > 0 else 0
until_retrain = max(0, self.retrain_threshold - len(self.feedback_data))
return {
"total_feedback": len(self.feedback_data),
"positive_feedback": positive_count,
"negative_feedback": negative_count,
"accuracy": accuracy,
"until_retrain": until_retrain,
"ready_for_retrain": until_retrain == 0
}
def export_for_ubiai(self, output_dir: str = "data") -> str:
"""
Export feedback in UBIAI's required CSV format.
The exported CSV will have three columns:
- System Prompt
- User Prompt
- Response
Args:
output_dir: Directory to save the CSV file
Returns:
Path to the exported CSV file
"""
if not self.feedback_data:
print("⚠️ No feedback to export")
return None
# Create output directory
Path(output_dir).mkdir(exist_ok=True)
# Generate timestamped filename
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filepath = f"{output_dir}/ubiai_training_data_{timestamp}.csv"
# Create DataFrame with UBIAI's required columns
df = pd.DataFrame(self.feedback_data)
# Export only the columns UBIAI needs
ubiai_columns = ["System Prompt", "User Prompt", "Response"]
df_export = df[ubiai_columns]
# Save to CSV
df_export.to_csv(filepath, index=False)
# Also save full data with labels for reference
full_filepath = f"{output_dir}/feedback_full_{timestamp}.csv"
df.to_csv(full_filepath, index=False)
# Print summary
print(f"\n{'='*70}")
print("✓ FEEDBACK EXPORTED SUCCESSFULLY")
print(f"{'='*70}")
print(f"\nUBIAI Training File: {filepath}")
print(f"Full Data (with labels): {full_filepath}")
print(f"\nExport Summary:")
print(f" Total entries: {len(df)}")
print(f" Positive (correct): {(df['is_correct'] == True).sum()}")
print(f" Negative (incorrect): {(df['is_correct'] == False).sum()}")
print(f" Accuracy: {df['is_correct'].mean():.2%}")
print(f"\n📋 NEXT STEPS:")
print(f" 1. Open UBIAI platform: https://ubiai.tools")
print(f" 2. Navigate to Project ID: {UBIAI_PROJECT_ID}")
print(f" 3. Upload file: {filepath}")
print(f" 4. Label examples as 'positive' or 'negative' in UBIAI")
print(f" 5. Run the training cell in Part 5 below")
print(f"{'='*70}\n")
self.last_export_time = datetime.now()
return filepath
def visualize_feedback(self):
"""Create visualizations of feedback trends."""
if not self.feedback_data:
print("No feedback to visualize")
return
df = pd.DataFrame(self.feedback_data)
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Chart 1: Feedback Distribution
label_counts = df["label"].value_counts()
colors = ['#2ecc71' if label == 'positive' else '#e74c3c' for label in label_counts.index]
axes[0].bar(label_counts.index, label_counts.values, color=colors)
axes[0].set_title('Feedback Distribution', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Count')
axes[0].set_xlabel('Label')
# Add value labels on bars
for i, v in enumerate(label_counts.values):
axes[0].text(i, v + 0.5, str(v), ha='center', fontweight='bold')
# Chart 2: Cumulative Accuracy Over Time
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.sort_values('timestamp')
df['cumulative_accuracy'] = df['is_correct'].expanding().mean()
axes[1].plot(range(len(df)), df['cumulative_accuracy'], linewidth=2, color='#3498db')
axes[1].axhline(y=0.8, color='#2ecc71', linestyle='--', alpha=0.5, label='Target (80%)')
axes[1].set_title('Cumulative Accuracy Over Time', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Number of Queries')
axes[1].set_ylabel('Accuracy')
axes[1].set_ylim([0, 1])
axes[1].grid(True, alpha=0.3)
axes[1].legend()
plt.tight_layout()
# Save visualization
viz_path = 'data/feedback_analysis.png'
Path('data').mkdir(exist_ok=True)
plt.savefig(viz_path, dpi=300, bbox_inches='tight')
plt.show()
print(f"\n✓ Visualization saved to: {viz_path}")
# Initialize feedback collector
feedback_collector = FeedbackCollector(retrain_threshold=50)
print("✓ Feedback Collector initialized!")
print(f"\nConfiguration:")
print(f" Retrain threshold: {feedback_collector.retrain_threshold} examples")
print(f" Export format: UBIAI CSV (System Prompt, User Prompt, Response)")
print(f" Automatic threshold monitoring: Enabled")
Simulating Production Usage with Feedback Collection¶
Now let’s simulate real usage where users submit questions and provide feedback on the results. In production, this feedback would come from actual users evaluating whether the generated SQL queries are correct.
In a real production system, feedback collection might look like:
# User submits question
result = agent.generate_and_validate(user_question, schema=db_schema)
# Query executes and user reviews results
user_feedback = get_user_feedback() # "correct" or "incorrect"
# Collect feedback
feedback_collector.add_feedback(
question=user_question,
generated_query=result['query'],
is_correct=(user_feedback == "correct"),
schema=db_schema
)
For this tutorial, we’ll simulate this with several test queries:
# Simulate production usage with feedback collection
import random
production_questions = [
"What are the names of all students?",
"Find students with GPA above 3.5",
"Calculate the average student age",
"How many students are enrolled?",
"Show the top 5 students by GPA",
"List students younger than 20",
"What's the highest GPA in the database?",
"Find all students named John",
"Show students ordered by age descending",
"Count students in each age group",
"List students with GPA between 3.0 and 3.5",
"Find the youngest student",
"Show all students with their ages",
"Calculate the median GPA",
"List students whose names start with 'A'"
]
print("Simulating Production Usage")
print("="*70)
print(f"Total queries: {len(production_questions)}")
print(f"Feedback: Automatic (simulated user responses)")
print(f"Schema: Provided for all queries\n")
for i, question in enumerate(production_questions, 1):
print(f"\n[{i}/{len(production_questions)}] {question[:60]}...")
# Generate query
result = agent.generate_and_validate(
question=question,
schema=example_schema,
execute=False,
verbose=False
)
# Simulate user feedback
# In production, this would come from actual user evaluation
# For simulation: 85% success rate (model performs well but not perfect)
is_correct = result["validation_passed"] and random.random() > 0.15
# Collect feedback if query was generated
if result["query"]:
feedback_collector.add_feedback(
question=question,
generated_query=result["query"],
is_correct=is_correct,
schema=example_schema,
user_comment="Simulated production feedback"
)
status = "✓" if is_correct else "✗"
print(f" Generated: {result['query'][:60]}...")
print(f" Feedback: {status} {'Correct' if is_correct else 'Incorrect'}")
else:
print(f" ✗ Generation failed: {result['error']}")
# Rate limiting
time.sleep(0.3)
print("\n" + "="*70)
print("FEEDBACK COLLECTION SUMMARY")
print("="*70)
stats = feedback_collector.get_statistics()
if "message" not in stats:
print(f"\nTotal Feedback Collected: {stats['total_feedback']}")
print(f" Positive (Correct): {stats['positive_feedback']}")
print(f" Negative (Incorrect): {stats['negative_feedback']}")
print(f" Current Accuracy: {stats['accuracy']:.2%}")
print(f"\nRetraining Status:")
if stats['ready_for_retrain']:
print(f" ✓ Ready for retraining!")
else:
print(f" ⏳ Need {stats['until_retrain']} more examples")
print(f" Progress: {stats['total_feedback']}/{feedback_collector.retrain_threshold}")
else:
print(stats['message'])
Visualizing Feedback Trends¶
Let’s create visualizations to understand how our model is performing over time:
# Visualize feedback data
feedback_collector.visualize_feedback()
Part 5: Export Feedback for UBIAI Training¶
Now comes the crucial step: exporting your collected feedback in the exact format UBIAI expects. This exported CSV file will be uploaded to your UBIAI project and used for reinforcement learning training.
What happens when you run this:
- Creates a CSV with columns: System Prompt, User Prompt, Response
- Also creates a full export with all metadata (labels, timestamps, etc.)
- Provides step-by-step instructions for uploading to UBIAI
Important: The exported CSV is ready to upload, but you still need to label each example as “positive” or “negative” in the UBIAI interface before training.
# Export feedback for UBIAI training
feedback_file = feedback_collector.export_for_ubiai(output_dir="data")
if feedback_file:
print("\nPreview of exported data (first 3 rows):")
print("="*70)
df_preview = pd.read_csv(feedback_file)
for idx, row in df_preview.head(3).iterrows():
print(f"\nRow {idx + 1}:")
print(f" System Prompt: {row['System Prompt'][:80]}...")
print(f" User Prompt: {row['User Prompt'][:80]}...")
print(f" Response: {row['Response']}")
print("-" * 70)
Part 6: Initiate Reinforcement Learning Training¶
This is where your model starts getting smarter! Once you’ve uploaded the feedback CSV to your UBIAI project and labeled the examples, you can initiate reinforcement learning training.
Understanding KTO Training Parameters:¶
- training_type: "reinforcement-kto" uses the KTO algorithm to learn from positive/negative labels
- num_iterations: More iterations mean better learning, but also longer training time and higher cost
- batch_size: Number of examples processed together (4 is typical for 8B models)
- dropout: Regularization to prevent overfitting (0.35 is a good default)
- validation_split: Percentage held out for validation (20% is standard)
Before Running Training:¶
1. Upload the exported CSV to your UBIAI project
2. Label examples as “positive” or “negative” in the UBIAI interface
3. Verify your UBIAI account has sufficient credits
4. Ensure the project has your base model selected
Set initiate_training = True below when ready:
# Training configuration
training_config = {
"training_type": "reinforcement-kto", # KTO reinforcement learning
"num_iterations": 20, # 20 epochs
"batch_size": 4, # 4 examples per batch
"dropout": 0.35, # 35% dropout for regularization
"validation_split": 20 # 20% validation set
}
print("Training Configuration:")
print("="*70)
for key, value in training_config.items():
print(f" {key}: {value}")
print("="*70)
print("\n⚠️ PRE-FLIGHT CHECKLIST:")
print(" ☐ Feedback CSV uploaded to UBIAI project")
print(" ☐ Examples labeled as positive/negative in UBIAI")
print(" ☐ UBIAI account has sufficient credits")
print(" ☐ Project ID and token are correct")
print("\n💡 Ready to train? Set 'initiate_training = True' in the next cell.")
# Set to True when ready to start training
initiate_training = False # Change to True to start training
if initiate_training:
print("\nInitiating reinforcement learning training...")
print("This may take several minutes to hours depending on dataset size.\n")
training_result = ubiai_client.initiate_training(**training_config)
print("="*70)
if training_result["success"]:
print("✓ TRAINING INITIATED SUCCESSFULLY!\n")
print("Training Response:")
print(json.dumps(training_result["response"], indent=2))
print("\n📊 Monitor training progress:")
print(f" 1. Visit: https://ubiai.tools")
print(f" 2. Navigate to Project ID: {UBIAI_PROJECT_ID}")
print(f" 3. Go to Models/Training tab")
print(f" 4. Watch for training completion notification")
print("\n⏰ Training typically takes 1-3 hours for small datasets")
print(" You'll receive an email when training completes.")
else:
print("✗ TRAINING INITIATION FAILED\n")
print(f"Error: {training_result['error']}")
print("\n🔍 Troubleshooting:")
print(" - Verify training data is uploaded to UBIAI project")
print(" - Check that examples are properly labeled")
print(" - Confirm your UBIAI token is valid")
print(" - Ensure project has sufficient credits")
print("="*70)
else:
print("\n⏸️ Training not initiated.")
print(" Set 'initiate_training = True' when ready.")
print("\n Remember: Upload and label your feedback in UBIAI first!")
Part 7: Monitoring Dashboard¶
While training runs on UBIAI’s servers, let’s review our current agent’s performance. This dashboard gives you a comprehensive view of how your system is performing and helps you track improvements over time.
def create_monitoring_dashboard():
"""
Create a comprehensive monitoring dashboard for the SQL agent.
"""
print("\n" + "="*70)
print("SQL AGENT MONITORING DASHBOARD")
print("="*70)
# Agent Statistics Section
print("\n📊 AGENT PERFORMANCE")
print("-" * 70)
agent_stats = agent.get_statistics()
print(f" Total Queries Processed: {agent_stats['total_queries']}")
print(f" Successful: {agent_stats['successful_queries']}")
print(f" Failed: {agent_stats['failed_queries']}")
print(f" Blocked (Security): {agent_stats['blocked_queries']}")
print(f" Success Rate: {agent_stats['success_rate']:.2%}")
# Feedback Statistics Section
print("\n📝 FEEDBACK COLLECTION")
print("-" * 70)
feedback_stats = feedback_collector.get_statistics()
if "message" in feedback_stats:
print(f" {feedback_stats['message']}")
else:
print(f" Total Feedback: {feedback_stats['total_feedback']}")
print(f" Positive: {feedback_stats['positive_feedback']}")
print(f" Negative: {feedback_stats['negative_feedback']}")
print(f" Model Accuracy: {feedback_stats['accuracy']:.2%}")
# Recent Activity Section
print("\n📋 RECENT QUERIES (Last 5)")
print("-" * 70)
if agent.query_log:
recent = agent.query_log[-5:]
for i, log in enumerate(recent, 1):
status = "✓" if log["validation_passed"] else "✗"
question_preview = log['question'][:50] + "..." if len(log['question']) > 50 else log['question']
print(f" {i}. {status} {question_preview}")
if log.get("error"):
print(f" Error: {log['error'][:60]}...")
else:
print(" No queries logged yet")
# Training Readiness Section
print("\n🎯 TRAINING READINESS")
print("-" * 70)
if "message" not in feedback_stats:
if feedback_stats['ready_for_retrain']:
print(" ✅ READY FOR RETRAINING!")
print(f" Collected: {feedback_stats['total_feedback']} examples")
print(f" Threshold: {feedback_collector.retrain_threshold} examples")
print("\n Next Steps:")
print(" 1. Export feedback (Part 5)")
print(" 2. Upload to UBIAI project")
print(" 3. Label examples in UBIAI")
print(" 4. Initiate training (Part 6)")
else:
remaining = feedback_stats['until_retrain']
current = feedback_stats['total_feedback']
threshold = feedback_collector.retrain_threshold
progress = (current / threshold) * 100 if threshold > 0 else 0
print(f" ⏳ Collection in progress...")
print(f" Need {remaining} more examples")
print(f" Progress: {current}/{threshold} ({progress:.1f}%)")
# Progress bar
bar_width = 40
filled = int(bar_width * progress / 100)
bar = "█" * filled + "░" * (bar_width - filled)
print(f" [{bar}] {progress:.1f}%")
else:
print(" No feedback collected yet")
# System Health Section
print("\n🏥 SYSTEM HEALTH")
print("-" * 70)
print(f" Agent Mode: {'Read-Only' if agent.read_only else 'Full Access'}")
print(f" Logging: {'Enabled' if agent.enable_logging else 'Disabled'}")
print(f" UBIAI Connection: Configured")
print(f" Project ID: {UBIAI_PROJECT_ID}")
print(f" Base Model: {BASE_MODEL}")
print("\n" + "="*70)
# Display the dashboard
create_monitoring_dashboard()
Part 8: Export Logs and Reports¶
For compliance, debugging, and analysis purposes, it’s important to export all logs and create summary reports. This data can be invaluable for understanding model behavior and tracking improvements over time.
# Create reports directory
reports_dir = Path("data/reports")
reports_dir.mkdir(exist_ok=True, parents=True)
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
print("Exporting logs and generating reports...\n")
# 1. Export agent query logs
agent_logs_path = f"data/reports/agent_logs_{timestamp}.csv"
agent.export_logs(agent_logs_path)
# 2. Export feedback data (if exists)
if feedback_collector.feedback_data:
feedback_path = f"data/reports/feedback_{timestamp}.csv"
df_feedback = pd.DataFrame(feedback_collector.feedback_data)
df_feedback.to_csv(feedback_path, index=False)
print(f"✓ Feedback exported to: {feedback_path}")
print(f" Total feedback entries: {len(df_feedback)}")
# 3. Create JSON summary report
summary_path = f"data/reports/summary_{timestamp}.json"
summary = {
"report_metadata": {
"generated_at": datetime.now().isoformat(),
"report_version": "1.0"
},
"agent_statistics": agent.get_statistics(),
"feedback_statistics": feedback_collector.get_statistics(),
"configuration": {
"ubiai_project_id": UBIAI_PROJECT_ID,
"base_model": BASE_MODEL,
"temperature": TEMPERATURE,
"read_only_mode": agent.read_only,
"retrain_threshold": feedback_collector.retrain_threshold
},
"training_readiness": {
"ready": feedback_collector.get_statistics().get("ready_for_retrain", False),
"feedback_collected": len(feedback_collector.feedback_data),
"threshold": feedback_collector.retrain_threshold
}
}
with open(summary_path, 'w') as f:
json.dump(summary, f, indent=2)
print(f"\n✓ Summary report exported to: {summary_path}")
print(f"\n{'='*70}")
print("EXPORT COMPLETE")
print(f"{'='*70}")
print(f"\nAll reports saved to: {reports_dir}/")
print(f"\nFiles created:")
print(f" 1. Agent query logs (CSV)")
print(f" 2. Feedback data (CSV)")
print(f" 3. Summary report (JSON)")
print(f"\nUse these files for:")
print(f" • Performance analysis")
print(f" • Compliance auditing")
print(f" • Debugging issues")
print(f" • Tracking improvements over time")
Conclusion: Your Journey to Production AI¶
Congratulations! You’ve just built a complete, production-ready SQL generation system with continuous improvement capabilities. Let’s recap what you’ve accomplished:
- Connected to UBIAI’s inference and training APIs with a reusable Python client
- Built a production SQL agent with query validation, read-only safeguards, and logging
- Collected user feedback and exported it in UBIAI’s training format
- Initiated KTO reinforcement learning to improve the model from that feedback
- Set up a monitoring dashboard and exportable reports to track performance over time
Resources:¶
- UBIAI Platform: https://ubiai.tools
- UBIAI Documentation: Check their docs for API updates
Remember:¶
The secret to a successful AI agent isn’t just building it—it’s continuously improving it based on real-world usage. Your model will get smarter with every retraining cycle, adapting to your specific use cases and user needs.
Start collecting feedback today, and watch your model evolve!