Solving the Agent Alignment Problem with Domain-Specific Fine-Tuning
The Agent Alignment Crisis
Why AI Agents Are Failing in Business
You’ve probably seen the hype around AI agents. They’re supposed to revolutionize how we work, answering questions, querying databases, writing reports. But here’s the truth: most AI agents fail spectacularly when deployed in real business environments.
Let me show you a real example.
A Typical Failure Scenario
Imagine your sales team asks a “simple” question:
User: "Show me all high-value customers who are at risk of churning."
Generic AI Agent Response:
Agent: "I'd be happy to help! Could you provide:
1. Your database schema
2. What 'high-value' means in your context
3. How you define 'churn risk'
4. The database type you're using"
Or worse, it generates this query:
{
"query": {
"value": "high",
"risk": "churning"
}
}
❌ This query is completely wrong. It doesn’t match your schema, doesn’t know your field names, and doesn’t understand your business logic.
Why This Happens: The 3 Core Problems
Problem 1: Generic Training Data
Foundation models like GPT-4 are trained on:
- Broad internet data
- Generic examples from thousands of different companies
- No knowledge of YOUR specific:
- Database schema
- Field naming conventions
- Business rules and KPIs
- Internal terminology
Result: The model has to guess, leading to hallucinations and errors.
Problem 2: The Context Window Trap
To make generic models work, you might try stuffing your entire database schema into the prompt:
context = """
Database: MongoDB
Collection: customers
Fields:
- customer_id: string (unique identifier)
- name: string (full name)
- email: string (contact email)
- lifetime_value: float (total $ spent)
- churn_risk_score: integer (1-10, where 7+ is high risk)
- status: string (active, inactive, suspended)
- acquisition_channel: string (organic_search, paid_ads, referral)
- last_purchase_date: datetime
- segment: string (consumer, smb, enterprise)
Business Rules:
- High value = lifetime_value > $10,000
- Churn risk = churn_risk_score >= 7
- Only consider active customers for churn analysis
...
""" # This is 2000+ tokens!
Problems with this approach:
- Expensive: 2000 tokens per query adds up fast
- Slow: More tokens = longer processing time
- Not scalable: Every query needs the full context
- Still unreliable: Model might still misinterpret or hallucinate
Problem 3: The Business Alignment Gap
Even with perfect prompting, generic models don’t truly “understand” your business:
- They don’t know that “high value” means > $10,000 in YOUR company (it might be > $1,000 elsewhere)
- They don’t know your churn_risk_score field exists
- They don’t know you calculate churn risk on a 1-10 scale
- They don’t know to filter for status: "active"
This is the alignment gap: the difference between what the model knows and what your business needs.
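Notice how small this gap actually is: once the business definitions are written down, they fit in a few lines of code. A minimal sketch, with the thresholds taken from the business rules above (the function name is ours, for illustration):

```python
# Business rules from the article, encoded once as constants
HIGH_VALUE_THRESHOLD = 10_000   # "high value" = lifetime_value > $10,000
CHURN_RISK_THRESHOLD = 7        # churn_risk_score >= 7 means high risk

def high_value_churn_filter() -> dict:
    """MongoDB filter for active, high-value customers at churn risk."""
    return {
        "lifetime_value": {"$gte": HIGH_VALUE_THRESHOLD},
        "churn_risk_score": {"$gte": CHURN_RISK_THRESHOLD},
        "status": "active",
    }
```

Fine-tuning is, in effect, teaching the model to produce exactly this filter without being handed the rules every time.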
The Solution: Fine-Tuning for Domain Alignment
Fine-tuning agent components means customizing the building blocks of an AI agent, such as its reasoning, retrieval, and action modules, so they align with your specific domain and workflow. Instead of training an entire model from scratch, we refine only the parts that matter most: how the agent understands your data, interprets tasks, and decides what to do next.
What Happens When You Fine-Tune
Instead of this:
User: "Show me high-value customers at churn risk"
↓
[Generic Model] → "What's your schema?" ❌
You get this:
User: "Show me high-value customers at churn risk"
↓
[Fine-Tuned agent] → {
"query": {
"lifetime_value": {"$gte": 10000},
"churn_risk_score": {"$gte": 7},
"status": "active"
}
} ✅
The model now KNOWS:
- Your exact field names
- Your business definitions
- Your query patterns
- Your data structure
The Benefits: Why Fine-Tuning Changes Everything
| Metric | Generic Model | Fine-Tuned Model | Improvement |
|---|---|---|---|
| Query Accuracy | 45-60% | 90-95% | +60% |
| Hallucination Rate | 30-40% | 5-10% | -75% |
| Schema Compliance | 50% | 98% | +96% |
| Context Required | 2000+ tokens | 200-500 tokens | -75% |
| Response Time | 2-3 seconds | 1-2 seconds | -40% |
| Monthly Cost | $500-800 | $150-250 | -65% |
Based on 10,000 queries/month
What We’re Building Today
In this blog, we’ll build a CRM Intelligence Agent that actually works in production. Here’s the complete workflow:
┌───────────────────────────────────────────────────┐
│ User asks a natural language question │
│ "Find customers at high churn risk" │
└────────────────────┬──────────────────────────────┘
↓
┌───────────────────────────────────────────────────┐
│ FUNCTION 1: Translate NL → MongoDB Query │
│ FINE-TUNED on YOUR CRM data │
│ Output: {"churn_risk_score": {"$gte": 7}} │
└────────────────────┬──────────────────────────────┘
↓
┌───────────────────────────────────────────────────┐
│ FUNCTION 2: Execute Query on MongoDB │
│ Returns: 47 matching customer records │
└────────────────────┬──────────────────────────────┘
↓
┌────────────────────────────────────────────────────-┐
│ FUNCTION 3: Generate Executive Report │
│ AI analyzes data, creates Word document │
│ Includes insights, recommendations, visualizations │
└────────────────────┬──────────────────────────────--┘
↓
┌────────────────────────────────────────────────────┐
│ FUNCTION 4: Email Report to Stakeholders │
│ Sends to: sales-team@company.com │
│ customer-success@company.com │
└────────────────────────────────────────────────────┘
The key insight: only Function 1 needs fine-tuning, though you can of course fine-tune other components as well.
Ready to Get Started?
Let’s get started! In the next section, we’ll prepare the training data that will teach our model about YOUR CRM system.
Step 1: Dataset Preparation
Understanding Instruction-Response Datasets
Fine-tuning requires instruction-response pairs, examples that show the model how to behave. Think of it like training a new employee:
Instruction (what the user asks):
"Find all customers who purchased in the last 30 days"
Response (what the model should output):
{
"query": {
"last_purchase_date": {
"$gte": ISODate("2024-10-05")
}
}
}
After seeing hundreds of these examples, the model learns:
- The field name is last_purchase_date (not purchase_date or bought_date)
- MongoDB uses the $gte operator for “greater than or equal”
- Dates are in ISODate format
- “Last 30 days” means calculating from today’s date
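That last point, turning a relative phrase like “last 30 days” into a concrete date, is worth making explicit. A small sketch of the calculation the model has to learn (the field name comes from the example above; the helper function is ours):

```python
from datetime import datetime, timedelta

def last_n_days_filter(n: int, today: datetime = None) -> dict:
    """MongoDB filter for purchases within the last n days."""
    today = today or datetime.utcnow()
    cutoff = today - timedelta(days=n)
    return {"last_purchase_date": {"$gte": cutoff}}

# With "today" fixed to Nov 4, 2024, the 30-day cutoff is Oct 5, 2024 —
# the same date as the ISODate in the example response above.
f = last_n_days_filter(30, today=datetime(2024, 11, 4))
```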
The Dataset We’re Using
We’ll use the letsrecap/Crm dataset from Hugging Face.
This is a curated collection of:
- 1000+ CRM-related questions
- Corresponding MongoDB queries
- Realistic business scenarios
- Proper query structure and syntax
Why this dataset is perfect:
- Domain-specific (CRM/customer data)
- Consistent format
- High-quality examples
- Free and open-source
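Later, in Step 3.2, we upload a file called crm_finetuning_data.csv, so the dataset needs to be flattened into instruction-response rows first. A sketch of that conversion using only the standard library; the column names ("instruction", "response") and the illustrative pairs below are assumptions about the expected upload format, so check them against the actual dataset:

```python
import csv
import json

# In practice the pairs would come from the Hugging Face dataset, e.g.:
#   from datasets import load_dataset
#   pairs = load_dataset("letsrecap/Crm")["train"]
# The two hand-written pairs below are placeholders for illustration.
pairs = [
    {"instruction": "Find all active customers",
     "response": json.dumps({"query": {"status": "active"}})},
    {"instruction": "Show customers with lifetime value over $10,000",
     "response": json.dumps({"query": {"lifetime_value": {"$gt": 10000}}})},
]

# Write instruction-response rows to the CSV used in Step 3.2
with open("crm_finetuning_data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["instruction", "response"])
    writer.writeheader()
    writer.writerows(pairs)
```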
🏗️ Step 2: Agent Architecture Design
Understanding Multi-Function Agents
Before we start with code, let’s understand what we’re building and WHY this architecture makes sense.
What is a Multi-Function Agent?
Think of an agent as an intelligent assistant that can use multiple “tools” (functions) to complete complex tasks. Each function is specialized for one job:
Agent = Orchestrator + Multiple Functions
The orchestrator (powered by GPT-4) decides:
- Which function to call
- In what order
- What parameters to pass
The functions are specialized workers:
- Function 1: Query translation (fine-tuned)
- Function 2: Database execution (standard code)
- Function 3: Report generation (standard AI)
- Function 4: Email delivery (standard code)
Our Agent’s Workflow (Detailed)
Let’s trace through exactly what happens when a user makes a request:
User Request:
"Find all high-value customers at risk of churning and email a report to the sales team"
What Happens Behind the Scenes:
Step 1: Agent receives request
Orchestrator (GPT-4): "I need to:
1. Translate the query to MongoDB
2. Execute the query
3. Generate a report
4. Email the report
Let me start with Function 1..."
Step 2: Call Function 1 (Fine-Tuned Model)
translate_nl_to_query(
natural_language_question="Find high-value customers at risk of churning",
additional_context="High value = >$10K lifetime value"
)
# Returns:
{
"query": {
"lifetime_value": {"$gte": 10000},
"churn_risk_score": {"$gte": 7},
"status": "active"
},
"sort": {"churn_risk_score": -1}
}
Step 3: Call Function 2 (Execute Query)
execute_mongodb_query(
query_object='{"query": {...}}'
)
# Returns:
{
"results": [...47 customer records...],
"count": 47
}
Step 4: Call Function 3 (Generate Report)
generate_executive_report(
query_results='{"results": [...], "count": 47}',
report_title="High-Value Churn Risk Analysis"
)
# Returns:
{
"filename": "Executive_Report_20241104.docx",
"success": True
}
Step 5: Call Function 4 (Send Email)
send_report_email(
recipient_emails="sales-team@company.com",
report_filename="Executive_Report_20241104.docx",
subject="Urgent: 47 High-Value Customers at Churn Risk"
)
# Returns:
{
"success": True,
"sent_at": "2024-11-04T10:30:00Z"
}
Step 6: Agent responds to user
"I've identified 47 high-value customers at churn risk and emailed a detailed
report to the sales team. The report includes customer details, risk scores,
and recommended actions."
Why This Architecture Works
1. Separation of Concerns
Each function does ONE thing well:
- Easier to test
- Easier to debug
- Easier to update
- Reusable in different workflows
2. Targeted Fine-Tuning
Only Function 1 needs fine-tuning:
- Lower cost: Fine-tune one specialized model
- Better results: Focus on the hardest problem
- Easy updates: Retrain just Function 1 when schema changes
3. Composability
Functions can be combined in different ways:
# Workflow 1: Query + Report + Email
translate → execute → report → email
# Workflow 2: Query + Save to database
translate → execute → save
# Workflow 3: Query + Slack notification
translate → execute → report → slack
4. Real-World Applicability
This mirrors how business intelligence actually works:
- Someone asks a question
- Analyst writes a query (now automated with fine-tuning)
- Query runs on database
- Results analyzed and reported
- Report shared with stakeholders
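The composability point can be made concrete with a tiny pipeline runner: each workflow is just a different list of functions applied in order. The stub steps below are placeholders standing in for the real functions, not the implementations we build later:

```python
from typing import Any, Callable, List

def run_pipeline(steps: List[Callable[[Any], Any]], payload: Any) -> Any:
    """Feed each step's output into the next step."""
    for step in steps:
        payload = step(payload)
    return payload

# Placeholder steps (illustrative only)
translate = lambda question: {"query": {"status": "active"}}
execute   = lambda query: {"results": ["CUST001", "CUST002"], "count": 2}
report    = lambda data: f"Report on {data['count']} customers"

# Workflow 1: translate → execute → report
summary = run_pipeline([translate, execute, report], "Find active customers")
```

Swapping `report` for a save-to-database or Slack-notification step gives Workflows 2 and 3 without touching the other functions.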
Visual Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ Natural Language: "Show me customers at high churn risk" │
└──────────────────────┬──────────────────────────────────────┘
│
↓
┌─────────────────────────────────────────────────────────────┐
│ AGENT ORCHESTRATOR (GPT-4) │
│ Decides: Which functions to call and in what order │
└──────────────────────┬──────────────────────────────────────┘
│
┌───────────────┼───────────────┬────────────────┐
↓ ↓ ↓ ↓
┌─────────────┐ ┌─────────────┐ ┌──────────┐ ┌────────────┐
│ FUNCTION 1 │ │ FUNCTION 2 │ │FUNCTION 3│ │ FUNCTION 4 │
│ NL→Query │ │ Execute │ │ Report │ │ Email │
│ [FINE-TUNED]│ │ Query │ │ Generate │ │ Send │
└─────────────┘ └─────────────┘ └──────────┘ └────────────┘
🔥 🗄️ 📊 📧
Ready to Build?
Now that you understand the architecture, let’s fine-tune the model and implement these functions!
Step 3: Fine-Tuning with UBIAI
What is UBIAI?
UBIAI is an enterprise AI platform that makes fine-tuning agents accessible. Instead of:
- Setting up GPU infrastructure ❌
- Managing training scripts ❌
- Monitoring training jobs ❌
- Deploying models ❌
You simply:
- Upload your data ✅
- Click “Fine-Tune” ✅
- Get an API endpoint ✅
Think of it as “AWS for Agent Fine-Tuning”
Step 3.1: Create a UBIAI Account
If you don’t have a UBIAI account yet:
- Go to https://ubiai.tools
- Click “Sign Up”
- Choose a plan
- Verify your email
You should now be on the UBIAI Dashboard.
Step 3.2: Upload Your Training Data
We created the file crm_finetuning_data.csv in Step 1. Now we’ll upload it to UBIAI.
In the UBIAI Dashboard:
- Navigate to “Fine-Tuning” in the left sidebar
- Click “New Model”
- You’ll see a configuration form
Upload your data and click “Start Fine-Tuning” to kick off the training job.
Step 3.3: Monitor Training Progress
UBIAI provides real-time monitoring of the training run in the dashboard.
Step 3.4: Get Your API Credentials
After 20-30 minutes, training completes.
Now, get your API credentials:
1. Click on your fine-tuned model
2. Go to the “API Access” tab
3. Copy the API URL and model token shown there
IMPORTANT: Save these credentials! You’ll need them for the next step.
Step 3.5: Test Your Fine-Tuned Model
Before integrating into the agent, let’s test the fine-tuned model directly to see the improvement.
In the UBIAI Dashboard:
- Click “Playground”
- Enter a test query
Perfect! The model is ready.
Compare this to a generic model, which would have asked for your schema instead of producing a query.
Cost Analysis
Let’s talk real numbers for a business scenario:
Scenario: 10,000 queries per month
Option 1: Generic Model (GPT-4 + Long Context)
Context per query: 2000 tokens (schema, examples, instructions)
Response: 200 tokens
Total per query: 2200 tokens
Monthly tokens: 22,000,000
Cost: ~$660/month (at $0.03/1K tokens)
Option 2: Fine-Tuned Model
Context per query: 200 tokens (just the question)
Response: 200 tokens
Total per query: 400 tokens
Monthly tokens: 4,000,000
Cost: ~$120/month + $5 fine-tuning = $125/month
Savings: $535/month or 81% cost reduction!
Plus:
- Better accuracy (90% vs 50%)
- Faster responses (fewer tokens to process)
- More reliable (fewer errors to debug)
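The arithmetic behind these numbers is easy to reproduce. The token counts and the $0.03/1K price are the assumptions stated above; the $5 one-time fine-tuning fee accounts for the small difference between this result and the $535 figure:

```python
def monthly_token_cost(queries: int, tokens_per_query: int,
                       price_per_1k_tokens: float) -> float:
    """Monthly API cost for a given query volume."""
    return queries * tokens_per_query / 1000 * price_per_1k_tokens

# Assumptions from the scenario above
generic    = monthly_token_cost(10_000, 2200, 0.03)  # long-context prompts: $660
fine_tuned = monthly_token_cost(10_000, 400, 0.03)   # short prompts: $120
savings = generic - fine_tuned                        # $540 before the $5 tuning fee
```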
Next Steps
Now that we have a fine-tuned model, let’s build the complete agent!
In the next section, we’ll implement all 4 functions and connect them into a working system.
Step 4: Building the Multi-Function Agent
Overview: What We’re Building
Now comes the exciting part: we’ll implement all 4 functions and wire them together into a working agent.
Here’s the plan:
- Configure API credentials
- Implement Function 1: NL → MongoDB Query (using your fine-tuned model)
- Implement Function 2: Execute MongoDB Query
- Implement Function 3: Generate Executive Report
- Implement Function 4: Send Report via Email
- Initialize the complete agent with LangChain
Let’s start!
Step 4.1: Configure Your Credentials
First, we need to set up all the API keys and connection strings.
⚠️ IMPORTANT: Replace the placeholder values with your actual credentials!
Install the required packages (this will take 1-2 minutes to complete):
!pip install -q datasets pandas pymongo langchain langchain-openai python-docx requests

import os
import requests
import json
from typing import Dict, List, Any
import pymongo
from datetime import datetime
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders

print("🔧 CONFIGURATION SETUP\n")
print("=" * 80)

# ============================================
# MongoDB Configuration
# ============================================
print("\nMongoDB Configuration")
print("  For local MongoDB: mongodb://localhost:27017/")
print("  For MongoDB Atlas: mongodb+srv://username:password@cluster.mongodb.net/\n")

MONGODB_CONFIG = {
    "connection_string": "mongodb://localhost:27017/",  # ⬅️ REPLACE if using Atlas
    "database": "crm_database",
    "collection": "customers"
}

print(f"  ✅ Database: {MONGODB_CONFIG['database']}")
print(f"  ✅ Collection: {MONGODB_CONFIG['collection']}")

# ============================================
# UBIAI Configuration (fine-tuned model)
# ============================================
UBIAI_CONFIG = {
    "api_url": "https://api.ubiai.tools:8443/api_v1/annotate",
    "model_token": "your-ubiai-model-token"  # ⬅️ REPLACE with the token from Step 3.4
}

# ============================================
# Email Configuration
# ============================================
print("\nEmail Configuration")
print("  For Gmail, create an app password at: https://myaccount.google.com/apppasswords\n")

EMAIL_CONFIG = {
    "smtp_server": "smtp.gmail.com",
    "smtp_port": 587,
    "sender_email": "your-email@gmail.com",   # ⬅️ REPLACE
    "sender_password": "your-app-password"    # ⬅️ REPLACE (use an app password, not your regular password)
}

print(f"  ✅ SMTP Server: {EMAIL_CONFIG['smtp_server']}:{EMAIL_CONFIG['smtp_port']}")
print(f"  ✅ Sender: {EMAIL_CONFIG['sender_email']}")

# ============================================
# OpenAI Configuration (for agent orchestration)
# ============================================
print("\nOpenAI Configuration")
print("  Get your API key at: https://platform.openai.com/api-keys\n")

OPENAI_API_KEY = "your-openai-api-key-here"  # ⬅️ REPLACE

print(f"  ✅ API Key: {OPENAI_API_KEY[:20]}...")

print("\n" + "=" * 80)
print("\n✅ Configuration complete!")
print("\n💡 Pro Tip: In production, use environment variables instead of hardcoded values:")
print("   os.environ['UBIAI_TOKEN'], os.environ['OPENAI_API_KEY'], etc.")
Step 4.2: Implement Function 1 – Natural Language → MongoDB Query
This is the star of the show: the function that uses your fine-tuned model!
What it does:
- Takes a natural language question
- Sends it to your fine-tuned UBIAI model
- Gets back a perfectly formatted MongoDB query
- Returns the query as a structured object
Let’s implement it:
from langchain.tools import tool
import requests
import json
from typing import Dict, Any

print("\nImplementing Function 1: NL → Query Translation\n")
print("This function will use your fine-tuned model to convert")
print("natural language questions into MongoDB queries.\n")

@tool
def translate_nl_to_query(natural_language_question: str) -> Dict[str, Any]:
    """Translate a natural language question into a MongoDB query using a fine-tuned UBIAI model."""
    url = "https://api.ubiai.tools:8443/api_v1/annotate"
    my_token = ""  # ⬅️ your model token from Step 3.4

    user_prompt = f"""Translate the following natural language question into a MongoDB query.
The output should be structured and ready to execute in MongoDB.
Question: {natural_language_question}"""

    data = {
        "input_text": "",
        "system_prompt": "You are an expert at converting natural language into MongoDB queries.",
        "user_prompt": user_prompt,
        "temperature": 0.0,
        "monitor_model": True,
        "knowledge_base_ids": [],
        "session_id": "",
        "images_urls": []
    }

    try:
        response = requests.post(url + my_token, json=data)
        if response.status_code == 200:
            res = response.json()
            return {"query": res.get("output")}
        else:
            return {"error": f"{response.status_code} - {response.text}"}
    except Exception as e:
        return {"error": str(e)}
Step 4.3: Implement Function 2 – Execute MongoDB Query
Now that we can generate queries, we need to execute them on our database.
What this function does:
- Takes the query object from Function 1
- Connects to MongoDB
- Executes the query
- Returns the results
This is standard Python code, no AI needed here!
print("\n🗄️ Implementing Function 2: Execute MongoDB Query\n")
print("This function connects to MongoDB and executes the generated queries.\n")

@tool
def execute_mongodb_query(
    query_object: str,  # Passed as JSON string for LangChain compatibility
    database_name: str = MONGODB_CONFIG["database"],
    collection_name: str = MONGODB_CONFIG["collection"]
) -> Dict[str, Any]:
    """
    Executes a MongoDB query and returns the results.

    This function takes the query generated by Function 1 and runs it against
    your MongoDB database. It handles all the connection logic, error cases,
    and result formatting.

    Args:
        query_object: JSON string containing query, projection, sort, and limit
            Example: '{"query": {"status": "active"}, "limit": 10}'
        database_name: MongoDB database name (defaults to config)
        collection_name: MongoDB collection name (defaults to config)

    Returns:
        Dict containing:
        - success: Boolean indicating if query succeeded
        - results: List of documents matching the query
        - count: Number of results returned
        - total_matching: Total documents matching (before limit)
        - query_info: Metadata about the query execution

    Example:
        >>> execute_mongodb_query('{"query": {"status": "active"}, "limit": 5}')
        {
            "success": True,
            "results": [{...}, {...}, ...],
            "count": 5,
            "total_matching": 247
        }
    """
    print(f"\n🔌 Connecting to MongoDB...")
    print(f"   Database: {database_name}")
    print(f"   Collection: {collection_name}")

    try:
        # Parse query object if it's a string
        if isinstance(query_object, str):
            query_params = json.loads(query_object)
        else:
            query_params = query_object

        # Extract query components
        query_filter = query_params.get("query", {})
        projection = query_params.get("projection", None)
        sort_criteria = query_params.get("sort", None)
        limit = query_params.get("limit", 100)

        print(f"\n   📊 Query Filter:")
        print(f"   {json.dumps(query_filter, indent=6)}")

        # Connect to MongoDB
        print(f"\n   ⚙️ Executing query...")
        client = pymongo.MongoClient(MONGODB_CONFIG["connection_string"])
        db = client[database_name]
        collection = db[collection_name]

        # Build and execute query
        cursor = collection.find(query_filter, projection)
        if sort_criteria:
            cursor = cursor.sort(list(sort_criteria.items()))
        if limit:
            cursor = cursor.limit(limit)

        # Get results
        results = list(cursor)

        # Convert ObjectId to string for JSON serialization
        for doc in results:
            if '_id' in doc:
                doc['_id'] = str(doc['_id'])

        # Get total count
        total_count = collection.count_documents(query_filter)
        client.close()

        print(f"   Query executed successfully!")
        print(f"   Results: {len(results)} returned (Total matching: {total_count})")

        return {
            "success": True,
            "results": results,
            "count": len(results),
            "total_matching": total_count,
            "query_info": {
                "database": database_name,
                "collection": collection_name,
                "filter": query_filter,
                "executed_at": datetime.now().isoformat()
            }
        }

    except pymongo.errors.ConnectionFailure:
        print("   ❌ Failed to connect to MongoDB")
        return {
            "success": False,
            "error": "Failed to connect to MongoDB. Check connection string and ensure MongoDB is running.",
            "results": [],
            "count": 0
        }
    except Exception as e:
        print(f"   ❌ Query failed: {str(e)}")
        return {
            "success": False,
            "error": f"Query execution failed: {str(e)}",
            "results": [],
            "count": 0
        }
print("✅ Function 2 implemented!")
print("\nNote: To test this function, you need MongoDB running with sample data.")
print("We'll set up sample data in the next cell.")
Step 4.4: Create Sample MongoDB Data (Optional)
If you don’t have a MongoDB database with CRM data, let’s create sample data for testing.
Skip this cell if you already have a database set up!
This will create 5 sample customers with realistic data:
def setup_sample_crm_database():
    """
    Creates sample CRM data in MongoDB for testing.

    This function will:
    1. Connect to MongoDB
    2. Clear any existing test data
    3. Insert 5 realistic customer records

    Run this once to populate test data.
    """
    print("\n🗄️ Setting up sample CRM database...\n")

    try:
        client = pymongo.MongoClient(MONGODB_CONFIG["connection_string"])
        db = client[MONGODB_CONFIG["database"]]
        collection = db[MONGODB_CONFIG["collection"]]

        # Clear existing data
        print("   🧹 Clearing existing data...")
        collection.delete_many({})

        # Create realistic sample customer data
        sample_customers = [
            {
                "customer_id": "CUST001",
                "name": "Alice Johnson",
                "email": "alice.johnson@example.com",
                "status": "active",
                "lifetime_value": 15000,
                "churn_risk_score": 8,  # High risk!
                "last_purchase_date": datetime(2024, 1, 15),
                "acquisition_channel": "organic_search",
                "segment": "enterprise",
                "created_at": datetime(2022, 3, 10)
            },
            {
                "customer_id": "CUST002",
                "name": "Bob Smith",
                "email": "bob.smith@example.com",
                "status": "active",
                "lifetime_value": 8500,
                "churn_risk_score": 3,  # Low risk
                "last_purchase_date": datetime(2024, 10, 20),
                "acquisition_channel": "paid_ads",
                "segment": "smb",
                "created_at": datetime(2023, 1, 5)
            },
            {
                "customer_id": "CUST003",
                "name": "Carol Davis",
                "email": "carol.davis@example.com",
                "status": "active",
                "lifetime_value": 22000,  # High value!
                "churn_risk_score": 9,  # High risk!
                "last_purchase_date": datetime(2023, 11, 5),
                "acquisition_channel": "referral",
                "segment": "enterprise",
                "created_at": datetime(2021, 6, 20)
            },
            {
                "customer_id": "CUST004",
                "name": "David Lee",
                "email": "david.lee@example.com",
                "status": "active",
                "lifetime_value": 4500,
                "churn_risk_score": 2,  # Low risk
                "last_purchase_date": datetime(2024, 10, 28),
                "acquisition_channel": "organic_search",
                "segment": "consumer",
                "created_at": datetime(2024, 8, 12)
            },
            {
                "customer_id": "CUST005",
                "name": "Emma Wilson",
                "email": "emma.wilson@example.com",
                "status": "inactive",
                "lifetime_value": 12000,
                "churn_risk_score": 10,  # Already churned
                "last_purchase_date": datetime(2023, 5, 15),
                "acquisition_channel": "paid_ads",
                "segment": "smb",
                "created_at": datetime(2022, 11, 3)
            }
        ]

        # Insert sample data
        print("   Inserting sample customer records...")
        result = collection.insert_many(sample_customers)

        print(f"\n   Success! Inserted {len(result.inserted_ids)} sample customers")
        print(f"\n   Database Info:")
        print(f"   • Database: {MONGODB_CONFIG['database']}")
        print(f"   • Collection: {MONGODB_CONFIG['collection']}")
        print(f"   • Total documents: {collection.count_documents({})}")
        print(f"\n   Sample Customers:")
        for customer in sample_customers:
            print(f"   • {customer['name']} - ${customer['lifetime_value']:,} LTV, Risk: {customer['churn_risk_score']}/10")

        client.close()
        return True

    except Exception as e:
        print(f"\n   Failed to setup database: {str(e)}")
        print(f"\n   Make sure MongoDB is running:")
        print(f"   • Local: Start with 'mongod' command")
        print(f"   • Atlas: Check your connection string")
        return False

# Uncomment the next line to create sample data
# setup_sample_crm_database()

print("\nTo create sample data, uncomment the line above and run this cell.")
Checkpoint: What We’ve Built So Far
Great progress! Let’s recap:
Function 1: Translates natural language → MongoDB queries (using fine-tuned model)
Function 2: Executes queries on MongoDB
Sample Data: Created realistic test customers (optional)
Next up: Functions 3 & 4, then we’ll wire everything together!
Step 4.5: Generate Executive Report
Now we have query results from MongoDB. Let’s turn them into professional business reports!
@tool
def generate_executive_report(
    query_results: str,  # JSON string of query results
    report_title: str,
    analysis_prompt: str = "Analyze these customer records and provide strategic insights",
    filename: str = None
) -> Dict[str, Any]:
    """
    Generates a professional executive report from query results.

    Args:
        query_results: JSON string containing query results
        report_title: Title for the report
        analysis_prompt: Specific analysis instructions
        filename: Output filename (auto-generated if not provided)

    Returns:
        Dict containing report filename and summary
    """
    print(f"\n📝 Generating executive report: '{report_title}'")

    try:
        # Parse query results
        if isinstance(query_results, str):
            results_data = json.loads(query_results)
        else:
            results_data = query_results

        results = results_data.get("results", [])
        count = results_data.get("count", 0)

        # Generate insights using the UBIAI API
        insights_prompt = f"""{analysis_prompt}

Data Summary:
- Total Records: {count}
- Sample Data: {json.dumps(results[:5], indent=2)}

Provide:
- Executive Summary (2-3 sentences)
- Key Findings (3-5 bullet points)
- Strategic Recommendations (3-5 action items)
- Risk Assessment
- Next Steps

Format as a professional business report."""

        payload = {
            "input_text": "",
            "system_prompt": "You are a business intelligence analyst creating executive reports.",
            "user_prompt": insights_prompt,
            "temperature": 0.7,
            "monitor_model": True,
            "knowledge_base_ids": [],
            "session_id": "",
            "images_urls": []
        }

        response = requests.post(
            UBIAI_CONFIG["api_url"] + UBIAI_CONFIG["model_token"],
            json=payload,
            timeout=60
        )
        if response.status_code != 200:
            return {"error": f"Failed to generate insights: {response.status_code}"}

        insights = response.json().get("output", "No insights generated")

        # Create Word document
        if filename is None:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"Executive_Report_{timestamp}.docx"

        doc = Document()

        # Title
        title = doc.add_heading(report_title, 0)
        title.alignment = WD_ALIGN_PARAGRAPH.CENTER

        # Metadata
        doc.add_paragraph(f"Generated: {datetime.now().strftime('%B %d, %Y at %I:%M %p')}")
        doc.add_paragraph(f"Total Records Analyzed: {count}")
        doc.add_paragraph()

        # Insights
        doc.add_heading('Analysis & Insights', 1)
        doc.add_paragraph(insights)

        # Data table
        doc.add_heading('Detailed Data', 1)
        if results:
            # Create table
            sample_size = min(10, len(results))
            keys = list(results[0].keys())
            table = doc.add_table(rows=1, cols=len(keys))
            table.style = 'Light Grid Accent 1'

            # Header row
            header_cells = table.rows[0].cells
            for i, key in enumerate(keys):
                header_cells[i].text = key.replace('_', ' ').title()

            # Data rows (first 10)
            for record in results[:sample_size]:
                row_cells = table.add_row().cells
                for i, key in enumerate(keys):
                    value = record.get(key, '')
                    row_cells[i].text = str(value)

            if len(results) > sample_size:
                doc.add_paragraph(f"\n(Showing {sample_size} of {count} total records)")

        # Footer
        doc.add_paragraph()
        footer = doc.add_paragraph("―" * 50)
        footer.add_run("\nGenerated by CRM Intelligence Agent | Powered by UBIAI")

        # Save document
        doc.save(filename)
        print(f"✅ Report generated: {filename}")

        return {
            "success": True,
            "filename": filename,
            "summary": f"Report contains {count} records with detailed analysis",
            "insights_preview": insights[:200] + "..."
        }

    except Exception as e:
        return {
            "success": False,
            "error": f"Failed to generate report: {str(e)}"
        }

print("\n✅ Function 3 (generate_executive_report) defined")
Step 4.6: Send Report via Email
Finally, we implement the function that emails the report to the sales team.
@tool
def send_report_email(
    recipient_emails: str,  # Comma-separated email addresses
    report_filename: str,
    subject: str,
    message_body: str = None,
    cc_emails: str = None
) -> Dict[str, Any]:
    """
    Sends an executive report via email to specified recipients.

    Args:
        recipient_emails: Comma-separated list of recipient email addresses
        report_filename: Path to the report file to attach
        subject: Email subject line
        message_body: Optional custom email body
        cc_emails: Optional comma-separated CC recipients

    Returns:
        Dict containing delivery status and recipient info
    """
    print(f"\n📧 Sending report via email...")

    try:
        # Parse recipient emails
        recipients = [email.strip() for email in recipient_emails.split(',')]
        cc_list = [email.strip() for email in cc_emails.split(',')] if cc_emails else []

        # Create email message
        msg = MIMEMultipart()
        msg['From'] = EMAIL_CONFIG['sender_email']
        msg['To'] = ', '.join(recipients)
        if cc_list:
            msg['Cc'] = ', '.join(cc_list)
        msg['Subject'] = subject
        msg['Date'] = datetime.now().strftime('%a, %d %b %Y %H:%M:%S %z')

        # Email body
        if message_body is None:
            message_body = f"""Dear Team,

Please find attached the CRM Intelligence Report generated on {datetime.now().strftime('%B %d, %Y')}.

This report contains critical insights from our customer database analysis. Please review the key findings and recommended actions.

Key highlights:
• Data-driven insights from recent customer analysis
• Strategic recommendations for customer retention
• Risk assessment and mitigation strategies

For questions or further analysis, please contact the Business Intelligence team.

Best regards,
CRM Intelligence Agent
Powered by UBIAI

―――――――――――――――――――――――――――――――
This is an automated report. Please do not reply to this email."""

        msg.attach(MIMEText(message_body, 'plain'))

        # Attach report file
        try:
            with open(report_filename, 'rb') as attachment:
                part = MIMEBase('application', 'octet-stream')
                part.set_payload(attachment.read())
                encoders.encode_base64(part)
                part.add_header(
                    'Content-Disposition',
                    f'attachment; filename={os.path.basename(report_filename)}'
                )
                msg.attach(part)
        except FileNotFoundError:
            return {
                "success": False,
                "error": f"Report file not found: {report_filename}"
            }

        # Send email
        server = smtplib.SMTP(EMAIL_CONFIG['smtp_server'], EMAIL_CONFIG['smtp_port'])
        server.starttls()
        server.login(EMAIL_CONFIG['sender_email'], EMAIL_CONFIG['sender_password'])
        all_recipients = recipients + cc_list
        server.sendmail(EMAIL_CONFIG['sender_email'], all_recipients, msg.as_string())
        server.quit()

        print(f"✅ Email sent successfully to {len(recipients)} recipient(s)")

        return {
            "success": True,
            "recipients": recipients,
            "cc": cc_list,
            "subject": subject,
            "attachment": os.path.basename(report_filename),
            "sent_at": datetime.now().isoformat()
        }

    except smtplib.SMTPAuthenticationError:
        return {
            "success": False,
            "error": "Email authentication failed. Check sender_email and sender_password in EMAIL_CONFIG."
        }
    except Exception as e:
        return {
            "success": False,
            "error": f"Failed to send email: {str(e)}"
        }

print("\n✅ Function 4 (send_report_email) defined")
Step 5: Initialize the Complete Agent
All that’s left is to put the agent components together:
from langchain_openai import ChatOpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.prompts.chat import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory

# Initialize memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Define all tools
tools = [
    translate_nl_to_query,
    execute_mongodb_query,
    generate_executive_report,
    send_report_email
]

# Initialize LLM (using GPT-4 for agent orchestration)
llm = ChatOpenAI(
    model="gpt-4",
    openai_api_key=OPENAI_API_KEY,  # set in Step 4.1
    temperature=0.3
)

# Create system prompt
system_prompt = """You are a CRM Intelligence Agent specializing in database analysis and reporting.

Your workflow:
1. Use translate_nl_to_query() to convert natural language questions into MongoDB queries
2. Use execute_mongodb_query() to run the query and get results
3. Use generate_executive_report() to create professional reports from the data
4. Use send_report_email() to deliver reports to stakeholders

Always:
- Confirm actions before executing irreversible operations (sending emails)
- Provide clear summaries at each step
- Handle errors gracefully and suggest alternatives

Use the fine-tuned query translation model for accurate database queries."""

# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Initialize agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    memory=memory,
    agent_kwargs={"prompt": prompt}
)

print("\n✅ CRM Intelligence Agent initialized successfully!")
print("\n🤖 Available capabilities:")
print("   1. Natural language → MongoDB query translation (fine-tuned)")
print("   2. Database query execution")
print("   3. Executive report generation")
print("   4. Automated email delivery")
Step 6: Practical Demonstration
Let’s see the complete agent in action with a real scenario!
Churn Risk Analysis
print("\n" + "=" * 80)
print("SCENARIO 1: HIGH-VALUE CHURN RISK ANALYSIS")
print("=" * 80 + "\n")

query_1 = """I need to identify all customers who:
- Have a lifetime value greater than $10,000
- Have a churn risk score of 7 or higher
- Are currently active

Please create a report titled 'High-Value Churn Risk Analysis' and send it to
sales-team@company.com and customer-success@company.com"""

try:
    response_1 = agent({"input": query_1})
    print("\nAgent Response:")
    print(response_1["output"])
except Exception as e:
    print(f"\n❌ Error: {str(e)}")
    print("\nNote: Ensure MongoDB is running and email credentials are configured.")
Performance Comparison: Generic vs Fine-Tuned Agent
Generic Agent Results:
❌ Query Accuracy: ~60%
❌ Schema Understanding: Poor (often uses wrong field names)
❌ Query Optimization: Suboptimal (slow queries)
❌ Business Logic: Doesn’t understand company-specific rules
❌ Hallucination Rate: High (makes up collection names)
Final Thoughts
Generic AI agents are powerful, but fine-tuned agents are game-changers for business applications. By embedding domain knowledge directly into the model, you create tools that:
- Actually understand your business
- Provide reliable, consistent results
- Cost less to operate
- Empower non-technical users
The future of AI in enterprise isn’t just about bigger models, it’s about smarter, specialized agents that truly align with your business goals.
🙏 Thank You!
If you found this tutorial helpful:
- 📤 Share with your team
- Notebook: https://colab.research.google.com/drive/147PZEqz9FEQC9S-k02gQtFSik13m1Ze0
Questions? Please reach out at admin@ubiai.tools!