Fine-Tuning a CRM Query Agent: From Natural Language to Database Intelligence

November 5, 2025

 

Solving the Agent Alignment Problem with Domain-Specific Fine-Tuning


The Agent Alignment Crisis

Why AI Agents Are Failing in Business

You’ve probably seen the hype around AI agents. They’re supposed to revolutionize how we work, answering questions, querying databases, writing reports. But here’s the truth: most AI agents fail spectacularly when deployed in real business environments.

Let me show you a real example.

A Typical Failure Scenario

Imagine your sales team asks a “simple” question:

User: "Show me all high-value customers who are at risk of churning."

Generic AI Agent Response:

Agent: "I'd be happy to help! Could you provide:
        1. Your database schema
        2. What 'high-value' means in your context
        3. How you define 'churn risk'
        4. The database type you're using"

Or worse, it generates this query:

{
  "query": {
    "value": "high",
    "risk": "churning"
  }
}

❌ This query is completely wrong. It doesn’t match your schema, doesn’t know your field names, and doesn’t understand your business logic.

Why This Happens: The 3 Core Problems

Problem 1: Generic Training Data

Foundation models like GPT-4 are trained on:

  • Broad internet data
  • Generic examples from thousands of different companies
  • No knowledge of YOUR specific:
    • Database schema
    • Field naming conventions
    • Business rules and KPIs
    • Internal terminology

Result: The model has to guess, leading to hallucinations and errors.

Problem 2: The Context Window Trap

To make generic models work, you might try stuffing your entire database schema into the prompt:

context = """
Database: MongoDB
Collection: customers
Fields:
  - customer_id: string (unique identifier)
  - name: string (full name)
  - email: string (contact email)
  - lifetime_value: float (total $ spent)
  - churn_risk_score: integer (1-10, where 7+ is high risk)
  - status: string (active, inactive, suspended)
  - acquisition_channel: string (organic_search, paid_ads, referral)
  - last_purchase_date: datetime
  - segment: string (consumer, smb, enterprise)

Business Rules:
  - High value = lifetime_value > $10,000
  - Churn risk = churn_risk_score >= 7
  - Only consider active customers for churn analysis
...
"""  # This is 2000+ tokens!

Problems with this approach:

  • Expensive: 2000 tokens per query adds up fast
  • Slow: More tokens = longer processing time
  • Not scalable: Every query needs the full context
  • Still unreliable: Model might still misinterpret or hallucinate

Problem 3: The Business Alignment Gap

Even with perfect prompting, generic models don’t truly “understand” your business:

  • They don’t know that “high value” means > $10K in YOUR company (it might be > $1K elsewhere)
  • They don’t know your churn_risk_score field exists
  • They don’t know you calculate churn risk on a 1-10 scale
  • They don’t know to filter for status: "active"

This is the alignment gap: the difference between what the model knows and what your business needs.

The Solution: Fine-Tuning for Domain Alignment

Fine-tuning agent components means customizing the building blocks of an AI agent, such as its reasoning, retrieval, and action modules, so they align with your specific domain and workflow. Instead of training an entire model from scratch, we refine only the parts that matter most: how the agent understands your data, interprets tasks, and decides what to do next.

What Happens When You Fine-Tune

Instead of this:

User: "Show me high-value customers at churn risk"
        ↓
[Generic Model] → "What's your schema?" ❌

You get this:

User: "Show me high-value customers at churn risk"
        ↓
[Fine-Tuned agent] → {
  "query": {
    "lifetime_value": {"$gte": 10000},
    "churn_risk_score": {"$gte": 7},
    "status": "active"
  }
} ✅

The model now KNOWS:

  • Your exact field names
  • Your business definitions
  • Your query patterns
  • Your data structure

The Benefits: Why Fine-Tuning Changes Everything

| Metric | Generic Model | Fine-Tuned Model | Improvement |
|---|---|---|---|
| Query Accuracy | 45-60% | 90-95% | +60% |
| Hallucination Rate | 30-40% | 5-10% | -75% |
| Schema Compliance | 50% | 98% | +96% |
| Context Required | 2000+ tokens | 200-500 tokens | -75% |
| Response Time | 2-3 seconds | 1-2 seconds | -40% |
| Monthly Cost | $500-800 | $150-250 | -65% |

Based on 10,000 queries/month
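The cost rows above can be sanity-checked with simple token arithmetic. A sketch assuming a flat $0.03 per 1K tokens and the per-query token counts from the table; real pricing varies by model and is usually split between prompt and completion tokens:

```python
# Back-of-the-envelope token economics, assuming a flat $0.03 per 1K tokens
# and 10,000 queries per month (both assumptions from the table above).

QUERIES_PER_MONTH = 10_000
PRICE_PER_1K_TOKENS = 0.03

def monthly_cost(tokens_per_query: int) -> float:
    """Total monthly spend for a given per-query token footprint."""
    total_tokens = tokens_per_query * QUERIES_PER_MONTH
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

generic = monthly_cost(2200)    # 2000-token context + 200-token response
fine_tuned = monthly_cost(400)  # 200-token prompt + 200-token response

print(f"Generic model:    ${generic:,.0f}/month")
print(f"Fine-tuned model: ${fine_tuned:,.0f}/month")
print(f"Savings: {1 - fine_tuned / generic:.0%}")
```

The same arithmetic reappears in the cost analysis of Step 3, where the fine-tuning fee is added on top of inference costs.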

What We’re Building Today

In this blog, we’ll build a CRM Intelligence Agent that actually works in production. Here’s the complete workflow:

┌───────────────────────────────────────────────────┐
│  User asks a natural language question            │
│  "Find customers at high churn risk"              │
└────────────────────┬──────────────────────────────┘
                     ↓
┌───────────────────────────────────────────────────┐
│  FUNCTION 1: Translate NL → MongoDB Query         │
│  FINE-TUNED on YOUR CRM data                      │
│  Output: {"churn_risk_score": {"$gte": 7}}        │
└────────────────────┬──────────────────────────────┘
                     ↓
┌───────────────────────────────────────────────────┐
│  FUNCTION 2: Execute Query on MongoDB             │
│  Returns: 47 matching customer records            │
└────────────────────┬──────────────────────────────┘
                     ↓
┌─────────────────────────────────────────────────────┐
│  FUNCTION 3: Generate Executive Report              │
│  AI analyzes data, creates Word document            │
│  Includes insights, recommendations, visualizations │
└────────────────────┬────────────────────────────────┘
                     ↓
┌────────────────────────────────────────────────────┐
│  FUNCTION 4: Email Report to Stakeholders          │
│  Sends to: sales-team@company.com                  │
│            customer-success@company.com            │
└────────────────────────────────────────────────────┘

The key insight: only Function 1 needs fine-tuning, though you can of course choose to fine-tune other components as well.

Ready to Get Started?

Let’s get started! In the next section, we’ll prepare the training data that will teach our model about YOUR CRM system.


Step 1: Dataset Preparation

Understanding Instruction-Response Datasets

Fine-tuning requires instruction-response pairs, examples that show the model how to behave. Think of it like training a new employee:

Instruction (what the user asks):
"Find all customers who purchased in the last 30 days"

Response (what the model should output):
{
  "query": {
    "last_purchase_date": {
      "$gte": ISODate("2024-10-05")
    }
  }
}

After seeing hundreds of these examples, the model learns:

  • The field name is last_purchase_date (not purchase_date or bought_date)
  • MongoDB uses $gte operator for “greater than or equal”
  • Dates are in ISODate format
  • “Last 30 days” means calculating from today’s date
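The “last 30 days” case deserves emphasis: the training pair has to encode a concrete cutoff date computed at data-generation time. A minimal sketch of building one such pair; the instruction/response key names and the ISO date-string format are assumptions here (MongoDB’s `ISODate(...)` is shell syntax, so driver-facing queries usually carry an ISO string or a datetime):

```python
import json
from datetime import datetime, timedelta

def make_training_pair(question: str, days: int) -> dict:
    """Build one instruction-response pair for a relative-date question."""
    # Compute the concrete cutoff date at generation time.
    cutoff = (datetime.now() - timedelta(days=days)).strftime("%Y-%m-%d")
    response = {"query": {"last_purchase_date": {"$gte": cutoff}}}
    # "instruction"/"response" key names are illustrative, not a fixed format.
    return {"instruction": question, "response": json.dumps(response)}

pair = make_training_pair("Find all customers who purchased in the last 30 days", 30)
print(pair["instruction"])
print(pair["response"])
```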

The Dataset We’re Using

We’ll use the letsrecap/Crm dataset from Hugging Face.

This is a curated collection of:

  • 1000+ CRM-related questions
  • Corresponding MongoDB queries
  • Realistic business scenarios
  • Proper query structure and syntax

Why this dataset is perfect:

  1. Domain-specific (CRM/customer data)
  2. Consistent format
  3. High-quality examples
  4. Free and open-source
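In Step 3.2 we upload these pairs as crm_finetuning_data.csv. A hedged sketch of writing a couple of pairs into a CSV with the standard library; the prompt/completion column names are an assumption, so match whatever format your fine-tuning platform actually expects before uploading:

```python
import csv

# Hypothetical examples in the dataset's instruction-response shape.
pairs = [
    ("Show me all active customers",
     '{"query": {"status": "active"}}'),
    ("Find enterprise customers worth over $10K",
     '{"query": {"segment": "enterprise", "lifetime_value": {"$gte": 10000}}}'),
]

# Column names here are an assumption, not a documented UBIAI format.
with open("crm_finetuning_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt", "completion"])
    writer.writerows(pairs)

print(f"Wrote {len(pairs)} training pairs to crm_finetuning_data.csv")
```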

🏗️ Step 2: Agent Architecture Design

Understanding Multi-Function Agents

Before we start with code, let’s understand what we’re building and WHY this architecture makes sense.

What is a Multi-Function Agent?

Think of an agent as an intelligent assistant that can use multiple “tools” (functions) to complete complex tasks. Each function is specialized for one job:

Agent = Orchestrator + Multiple Functions

The orchestrator (powered by GPT-4) decides:

  • Which function to call
  • In what order
  • What parameters to pass

The functions are specialized workers:

  • Function 1: Query translation (fine-tuned)
  • Function 2: Database execution (standard code)
  • Function 3: Report generation (standard AI)
  • Function 4: Email delivery (standard code)
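This orchestrator-plus-tools split can be sketched in plain Python as a registry of callables, stubbed here for illustration; the real agent (Step 4) wires these functions into LangChain and lets GPT-4 choose among them:

```python
from typing import Any, Callable, Dict

# Stub implementations standing in for the four real functions.
def translate_nl_to_query(question: str) -> Dict[str, Any]:
    return {"query": {"churn_risk_score": {"$gte": 7}}}

def execute_mongodb_query(query_object: Dict[str, Any]) -> Dict[str, Any]:
    return {"results": [], "count": 0}

def generate_executive_report(query_results: Dict[str, Any]) -> Dict[str, Any]:
    return {"filename": "report.docx", "success": True}

def send_report_email(recipients: str, filename: str) -> Dict[str, Any]:
    return {"success": True}

# The orchestrator's view: a registry mapping tool names to callables.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "translate_nl_to_query": translate_nl_to_query,
    "execute_mongodb_query": execute_mongodb_query,
    "generate_executive_report": generate_executive_report,
    "send_report_email": send_report_email,
}

# Dispatching a call by name, the way an LLM tool-use loop does.
result = TOOLS["translate_nl_to_query"]("Find customers at churn risk")
print(result)  # {'query': {'churn_risk_score': {'$gte': 7}}}
```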

Our Agent’s Workflow (Detailed)

Let’s trace through exactly what happens when a user makes a request:

User Request:

"Find all high-value customers at risk of churning and email a report to the sales team"

What Happens Behind the Scenes:

Step 1: Agent receives request

Orchestrator (GPT-4): "I need to:
  1. Translate the query to MongoDB
  2. Execute the query
  3. Generate a report
  4. Email the report

Let me start with Function 1..."

Step 2: Call Function 1 (Fine-Tuned Model)

translate_nl_to_query(
  natural_language_question="Find high-value customers at risk of churning",
  additional_context="High value = >$10K lifetime value"
)

# Returns:
{
  "query": {
    "lifetime_value": {"$gte": 10000},
    "churn_risk_score": {"$gte": 7},
    "status": "active"
  },
  "sort": {"churn_risk_score": -1}
}

Step 3: Call Function 2 (Execute Query)

execute_mongodb_query(
  query_object='{"query": {...}}'
)

# Returns:
{
  "results": [...47 customer records...],
  "count": 47
}

Step 4: Call Function 3 (Generate Report)

generate_executive_report(
  query_results='{"results": [...], "count": 47}',
  report_title="High-Value Churn Risk Analysis"
)

# Returns:
{
  "filename": "Executive_Report_20241104.docx",
  "success": True
}

Step 5: Call Function 4 (Send Email)

send_report_email(
  recipient_emails="sales-team@company.com",
  report_filename="Executive_Report_20241104.docx",
  subject="Urgent: 47 High-Value Customers at Churn Risk"
)

# Returns:
{
  "success": True,
  "sent_at": "2024-11-04T10:30:00Z"
}

Step 6: Agent responds to user

"I've identified 47 high-value customers at churn risk and emailed a detailed
report to the sales team. The report includes customer details, risk scores,
and recommended actions."

Why This Architecture Works

1. Separation of Concerns

Each function does ONE thing well:

  • Easier to test
  • Easier to debug
  • Easier to update
  • Reusable in different workflows

2. Targeted Fine-Tuning

Only Function 1 needs fine-tuning:

  • Lower cost: Fine-tune one specialized model
  • Better results: Focus on the hardest problem
  • Easy updates: Retrain just Function 1 when schema changes

3. Composability

Functions can be combined in different ways:

# Workflow 1: Query + Report + Email
translate → execute → report → email

# Workflow 2: Query + Save to database
translate → execute → save

# Workflow 3: Query + Slack notification
translate → execute → report → slack
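The workflows above can be made concrete with a tiny pipeline runner that threads each step’s output into the next step. Stub steps only, for illustration; the real functions are implemented in Step 4:

```python
# Stub steps: each takes the previous step's output and returns its own.
def translate(question):
    return {"query": {"status": "active"}}

def execute(query):
    return {"results": ["..."], "count": 1}

def report(results):
    return {"filename": "report.docx"}

def email(report_info):
    return {"sent": True, "attachment": report_info["filename"]}

def run_workflow(steps, initial_input):
    """Thread one value through each step in order."""
    value = initial_input
    for step in steps:
        value = step(value)
    return value

# Workflow 1: translate → execute → report → email
outcome = run_workflow([translate, execute, report, email],
                       "Show me active customers")
print(outcome)  # {'sent': True, 'attachment': 'report.docx'}
```

Swapping the final step (email, save, Slack) changes the workflow without touching the earlier functions.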

4. Real-World Applicability

This mirrors how business intelligence actually works:

  1. Someone asks a question
  2. Analyst writes a query (now automated with fine-tuning)
  3. Query runs on database
  4. Results analyzed and reported
  5. Report shared with stakeholders

Visual Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                    USER INTERFACE                           │
│  Natural Language: "Show me customers at high churn risk"   │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ↓
┌─────────────────────────────────────────────────────────────┐
│              AGENT ORCHESTRATOR (GPT-4)                     │
│  Decides: Which functions to call and in what order         │
└──────────────────────┬──────────────────────────────────────┘
                       │
       ┌───────────────┼───────────────┬────────────────┐
       ↓               ↓               ↓                ↓
┌─────────────┐ ┌─────────────┐ ┌──────────┐ ┌────────────┐
│ FUNCTION 1  │ │ FUNCTION 2  │ │FUNCTION 3│ │ FUNCTION 4 │
│ NL→Query    │ │ Execute     │ │ Report   │ │ Email      │
│ [FINE-TUNED]│ │ Query       │ │ Generate │ │ Send       │
└─────────────┘ └─────────────┘ └──────────┘ └────────────┘
      🔥              🗄️            📊             📧

Ready to Build?

Now that you understand the architecture, let’s fine-tune the model and implement these functions!


Step 3: Fine-Tuning with UBIAI

What is UBIAI?

UBIAI is an enterprise AI platform that makes fine-tuning agents accessible. Instead of:

  • Setting up GPU infrastructure ❌
  • Managing training scripts ❌
  • Monitoring training jobs ❌
  • Deploying models ❌

You simply:

  1. Upload your data ✅
  2. Click “Fine-Tune” ✅
  3. Get an API endpoint ✅

Think of it as “AWS for Agent Fine-Tuning”

Step 3.1: Create a UBIAI Account

If you don’t have a UBIAI account yet:

  1. Go to https://ubiai.tools
  2. Click “Sign Up”
  3. Choose a plan
  4. Verify your email

You should now be on the UBIAI Dashboard.

Step 3.2: Upload Your Training Data

We created the file crm_finetuning_data.csv in Step 1. Now we’ll upload it to UBIAI.

In the UBIAI Dashboard:

  1. Navigate to “Fine-Tuning” in the left sidebar
  2. Click “New Model”
  3. In the form that appears, upload crm_finetuning_data.csv
  4. Click “Start Fine-Tuning”

Step 3.3: Monitor Training Progress

UBIAI provides real-time monitoring of the training job in the dashboard.



Step 3.4: Get Your API Credentials

After 20-30 minutes, training completes.

Now, get your API credentials:

  1. Click on your fine-tuned model
  2. Go to the “API Access” tab
  3. Copy the API endpoint and model token displayed there

IMPORTANT: Save these credentials! You’ll need them for the next step.

Step 3.5: Test Your Fine-Tuned Model

Before integrating into the agent, let’s test the fine-tuned model directly to see the improvement.

In the UBIAI Dashboard:

  1. Click “Playground”
  2. Enter a test query

If the output matches your schema, the model is ready. Compare this to a generic model, which typically asks clarifying questions or invents field names.

Cost Analysis

Let’s talk real numbers for a business scenario:

Scenario: 10,000 queries per month

Option 1: Generic Model (GPT-4 + Long Context)

Context per query: 2000 tokens (schema, examples, instructions)
Response: 200 tokens
Total per query: 2200 tokens
Monthly tokens: 22,000,000
Cost: ~$660/month (at $0.03/1K tokens)

Option 2: Fine-Tuned Model

Context per query: 200 tokens (just the question)
Response: 200 tokens
Total per query: 400 tokens
Monthly tokens: 4,000,000
Cost: ~$120/month + $5 fine-tuning = $125/month

Savings: $535/month or 81% cost reduction!

Plus:

  • Better accuracy (90% vs 50%)
  • Faster responses (fewer tokens to process)
  • More reliable (fewer errors to debug)

Next Steps

Now that we have a fine-tuned model, let’s build the complete agent!

In the next section, we’ll implement all 4 functions and connect them into a working system.

Step 4: Building the Multi-Function Agent

Overview: What We’re Building

Now comes the exciting part, we’ll implement all 4 functions and wire them together into a working agent.

Here’s the plan:

  1. Configure API credentials
  2. Implement Function 1: NL → MongoDB Query (using your fine-tuned model)
  3. Implement Function 2: Execute MongoDB Query
  4. Implement Function 3: Generate Executive Report
  5. Implement Function 4: Send Report via Email
  6. Initialize the complete agent with LangChain

Let’s start!

Step 4.1: Configure Your Credentials

First, we need to set up all the API keys and connection strings.

⚠️ IMPORTANT: Replace the placeholder values with your actual credentials!

# Install required packages (this will take 1-2 minutes to complete)
!pip install -q datasets pandas pymongo langchain langchain-openai python-docx requests

import os
import requests
import json
from typing import Dict, List, Any
import pymongo
from datetime import datetime
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders

print("🔧 CONFIGURATION SETUP\n")
print("=" * 80)

# ============================================
# MongoDB Configuration
# ============================================
print("\n MongoDB Configuration")
print("   For local MongoDB: mongodb://localhost:27017/")
print("   For MongoDB Atlas: mongodb+srv://username:password@cluster.mongodb.net/\n")

MONGODB_CONFIG = {
    "connection_string": "mongodb://localhost:27017/",  # ⬅️ REPLACE if using Atlas
    "database": "crm_database",
    "collection": "customers"
}

print(f"   ✅ Database: {MONGODB_CONFIG['database']}")
print(f"   ✅ Collection: {MONGODB_CONFIG['collection']}")

# ============================================
# Email Configuration
# ============================================
print("\n Email Configuration")
print("   For Gmail, create an app password at: https://myaccount.google.com/apppasswords\n")

EMAIL_CONFIG = {
    "smtp_server": "smtp.gmail.com",
    "smtp_port": 587,
    "sender_email": "your-email@gmail.com",   # ⬅️ REPLACE
    "sender_password": "your-app-password"    # ⬅️ REPLACE (use an app password, not your regular password)
}

print(f"   ✅ SMTP Server: {EMAIL_CONFIG['smtp_server']}:{EMAIL_CONFIG['smtp_port']}")
print(f"   ✅ Sender: {EMAIL_CONFIG['sender_email']}")

# ============================================
# OpenAI Configuration (for agent orchestration)
# ============================================
print("\n OpenAI Configuration")
print("   Get your API key at: https://platform.openai.com/api-keys\n")

OPENAI_API_KEY = "your-openai-api-key-here"  # ⬅️ REPLACE

print(f"   ✅ API Key: {OPENAI_API_KEY[:20]}...")

# ============================================
# UBIAI Configuration (fine-tuned model from Step 3.4)
# ============================================
UBIAI_CONFIG = {
    "api_url": "https://api.ubiai.tools:8443/api_v1/annotate",
    "model_token": "your-ubiai-model-token"  # ⬅️ REPLACE with the token from Step 3.4
}

print("\n" + "=" * 80)
print("\n✅ Configuration complete!")
print("\n💡 Pro Tip: In production, use environment variables instead of hardcoded values:")
print("   os.environ['UBIAI_TOKEN'], os.environ['OPENAI_API_KEY'], etc.")

Step 4.2: Implement Function 1 – Natural Language → MongoDB Query

This is the star of the show, the function that uses your fine-tuned model!

What it does:

  1. Takes a natural language question
  2. Sends it to your fine-tuned UBIAI model
  3. Gets back a perfectly formatted MongoDB query
  4. Returns the query object to the agent

Let’s implement it:

from langchain.tools import tool
import requests
import json
from typing import Dict, Any

print("\n Implementing Function 1: NL → Query Translation\n")
print("This function will use your fine-tuned model to convert")
print("natural language questions into MongoDB queries.\n")

@tool
def translate_nl_to_query(natural_language_question: str) -> Dict[str, Any]:
    """Translate a natural language question into a MongoDB query using a fine-tuned UBIAI model."""

    url = "https://api.ubiai.tools:8443/api_v1/annotate"
    my_token = ""  # ⬅️ your UBIAI model token from Step 3.4

    user_prompt = f"""Translate the following natural language question into a MongoDB query.

The output should be structured and ready to execute in MongoDB.

Question: {natural_language_question}
"""

    data = {
        "input_text": "",
        "system_prompt": "You are an expert at converting natural language into MongoDB queries.",
        "user_prompt": user_prompt,
        "temperature": 0.0,
        "monitor_model": True,
        "knowledge_base_ids": [],
        "session_id": "",
        "images_urls": []
    }

    try:
        response = requests.post(url + my_token, json=data)
        if response.status_code == 200:
            res = response.json()
            return {"query": res.get("output")}
        else:
            return {"error": f"{response.status_code} - {response.text}"}
    except Exception as e:
        return {"error": str(e)}
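One practical caveat: Function 1 returns the model’s raw output string under "query". Models sometimes wrap JSON in Markdown code fences, so it can help to parse defensively before handing the result to Function 2. A small hypothetical helper, not part of the original cell:

```python
import json
import re

def parse_query_output(raw: str) -> dict:
    """Parse a model's query output into a dict, tolerating ```json fences."""
    text = raw.strip()
    # Strip a Markdown code fence if the model added one.
    match = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)

fenced = '```json\n{"query": {"status": "active"}}\n```'
print(parse_query_output(fenced))  # {'query': {'status': 'active'}}
```

If `json.loads` raises, the safest move is to return an error to the orchestrator rather than execute a malformed query.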

Step 4.3: Implement Function 2 – Execute MongoDB Query

Now that we can generate queries, we need to execute them on our database.

What this function does:

  1. Takes the query object from Function 1
  2. Connects to MongoDB
  3. Executes the query
  4. Returns the results

This is standard Python code, no AI needed here!

print("\n🗄️ Implementing Function 2: Execute MongoDB Query\n")
print("This function connects to MongoDB and executes the generated queries.\n")

@tool
def execute_mongodb_query(
    query_object: str,  # Passed as a JSON string for LangChain compatibility
    database_name: str = MONGODB_CONFIG["database"],
    collection_name: str = MONGODB_CONFIG["collection"]
) -> Dict[str, Any]:
    """
    Executes a MongoDB query and returns the results.

    This function takes the query generated by Function 1 and runs it against
    your MongoDB database. It handles all the connection logic, error cases,
    and result formatting.

    Args:
        query_object: JSON string containing query, projection, sort, and limit
            Example: '{"query": {"status": "active"}, "limit": 10}'
        database_name: MongoDB database name (defaults to config)
        collection_name: MongoDB collection name (defaults to config)

    Returns:
        Dict containing:
        - success: Boolean indicating if the query succeeded
        - results: List of documents matching the query
        - count: Number of results returned
        - total_matching: Total documents matching (before limit)
        - query_info: Metadata about the query execution

    Example:
        >>> execute_mongodb_query('{"query": {"status": "active"}, "limit": 5}')
        {
            "success": True,
            "results": [{...}, {...}, ...],
            "count": 5,
            "total_matching": 247
        }
    """
    print(f"\n🔌 Connecting to MongoDB...")
    print(f"   Database: {database_name}")
    print(f"   Collection: {collection_name}")

    try:
        # Parse the query object if it's a string
        if isinstance(query_object, str):
            query_params = json.loads(query_object)
        else:
            query_params = query_object

        # Extract query components
        query_filter = query_params.get("query", {})
        projection = query_params.get("projection", None)
        sort_criteria = query_params.get("sort", None)
        limit = query_params.get("limit", 100)

        print(f"\n   📊 Query Filter:")
        print(f"      {json.dumps(query_filter, indent=6)}")

        # Connect to MongoDB
        print(f"\n   ⚙️  Executing query...")
        client = pymongo.MongoClient(MONGODB_CONFIG["connection_string"])
        db = client[database_name]
        collection = db[collection_name]

        # Build and execute the query
        cursor = collection.find(query_filter, projection)

        if sort_criteria:
            cursor = cursor.sort(list(sort_criteria.items()))

        if limit:
            cursor = cursor.limit(limit)

        # Materialize the results
        results = list(cursor)

        # Convert ObjectId to string for JSON serialization
        for doc in results:
            if '_id' in doc:
                doc['_id'] = str(doc['_id'])

        # Get the total count (before the limit)
        total_count = collection.count_documents(query_filter)

        client.close()

        print(f"   ✅ Query executed successfully!")
        print(f"   Results: {len(results)} returned (Total matching: {total_count})")

        return {
            "success": True,
            "results": results,
            "count": len(results),
            "total_matching": total_count,
            "query_info": {
                "database": database_name,
                "collection": collection_name,
                "filter": query_filter,
                "executed_at": datetime.now().isoformat()
            }
        }

    except pymongo.errors.ConnectionFailure:
        print("   ❌ Failed to connect to MongoDB")
        return {
            "success": False,
            "error": "Failed to connect to MongoDB. Check the connection string and ensure MongoDB is running.",
            "results": [],
            "count": 0
        }
    except Exception as e:
        print(f"   ❌ Query failed: {str(e)}")
        return {
            "success": False,
            "error": f"Query execution failed: {str(e)}",
            "results": [],
            "count": 0
        }

print("✅ Function 2 implemented!")
print("\n Note: To test this function, you need MongoDB running with sample data.")
print("   We'll set up sample data in the next cell.")
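Because even a fine-tuned model can occasionally drift, a cheap safety net is to validate the generated filter’s field names against the known schema before executing it. A sketch using the field list shown earlier in this post; the helper name and the recursion over `$and`/`$or` are my additions:

```python
# Field names from the CRM schema shown earlier in this post.
KNOWN_FIELDS = {
    "customer_id", "name", "email", "lifetime_value", "churn_risk_score",
    "status", "acquisition_channel", "last_purchase_date", "segment",
}

def unknown_fields(query_filter: dict) -> set:
    """Return filter keys that aren't in the schema (ignoring $-operators)."""
    bad = set()
    for key, value in query_filter.items():
        if key.startswith("$"):
            # Logical operators like $and/$or hold lists of sub-filters.
            if isinstance(value, list):
                for sub in value:
                    bad |= unknown_fields(sub)
            continue
        if key not in KNOWN_FIELDS:
            bad.add(key)
    return bad

good = {"lifetime_value": {"$gte": 10000}, "status": "active"}
drifted = {"life_time_value": {"$gte": 10000}}  # hallucinated field name

print(unknown_fields(good))     # set()
print(unknown_fields(drifted))  # {'life_time_value'}
```

Rejecting a filter with unknown fields before it reaches `collection.find` turns silent empty results into an actionable error.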

Step 4.4: Create Sample MongoDB Data (Optional)

If you don’t have a MongoDB database with CRM data, let’s create sample data for testing.

Skip this cell if you already have a database set up!

This will create 5 sample customers with realistic data:

def setup_sample_crm_database():
    """
    Creates sample CRM data in MongoDB for testing.

    This function will:
    1. Connect to MongoDB
    2. Clear any existing test data
    3. Insert 5 realistic customer records

    Run this once to populate test data.
    """
    print("\n🗄️ Setting up sample CRM database...\n")

    try:
        client = pymongo.MongoClient(MONGODB_CONFIG["connection_string"])
        db = client[MONGODB_CONFIG["database"]]
        collection = db[MONGODB_CONFIG["collection"]]

        # Clear existing data
        print("   🧹 Clearing existing data...")
        collection.delete_many({})

        # Create realistic sample customer data
        sample_customers = [
            {
                "customer_id": "CUST001",
                "name": "Alice Johnson",
                "email": "alice.johnson@example.com",
                "status": "active",
                "lifetime_value": 15000,
                "churn_risk_score": 8,  # High risk!
                "last_purchase_date": datetime(2024, 1, 15),
                "acquisition_channel": "organic_search",
                "segment": "enterprise",
                "created_at": datetime(2022, 3, 10)
            },
            {
                "customer_id": "CUST002",
                "name": "Bob Smith",
                "email": "bob.smith@example.com",
                "status": "active",
                "lifetime_value": 8500,
                "churn_risk_score": 3,  # Low risk
                "last_purchase_date": datetime(2024, 10, 20),
                "acquisition_channel": "paid_ads",
                "segment": "smb",
                "created_at": datetime(2023, 1, 5)
            },
            {
                "customer_id": "CUST003",
                "name": "Carol Davis",
                "email": "carol.davis@example.com",
                "status": "active",
                "lifetime_value": 22000,  # High value!
                "churn_risk_score": 9,    # High risk!
                "last_purchase_date": datetime(2023, 11, 5),
                "acquisition_channel": "referral",
                "segment": "enterprise",
                "created_at": datetime(2021, 6, 20)
            },
            {
                "customer_id": "CUST004",
                "name": "David Lee",
                "email": "david.lee@example.com",
                "status": "active",
                "lifetime_value": 4500,
                "churn_risk_score": 2,  # Low risk
                "last_purchase_date": datetime(2024, 10, 28),
                "acquisition_channel": "organic_search",
                "segment": "consumer",
                "created_at": datetime(2024, 8, 12)
            },
            {
                "customer_id": "CUST005",
                "name": "Emma Wilson",
                "email": "emma.wilson@example.com",
                "status": "inactive",
                "lifetime_value": 12000,
                "churn_risk_score": 10,  # Already churned
                "last_purchase_date": datetime(2023, 5, 15),
                "acquisition_channel": "paid_ads",
                "segment": "smb",
                "created_at": datetime(2022, 11, 3)
            }
        ]

        # Insert sample data
        print("   Inserting sample customer records...")
        result = collection.insert_many(sample_customers)

        print(f"\n   Success! Inserted {len(result.inserted_ids)} sample customers")
        print(f"\n   Database Info:")
        print(f"      • Database: {MONGODB_CONFIG['database']}")
        print(f"      • Collection: {MONGODB_CONFIG['collection']}")
        print(f"      • Total documents: {collection.count_documents({})}")

        print(f"\n   Sample Customers:")
        for customer in sample_customers:
            print(f"      • {customer['name']} - ${customer['lifetime_value']:,} LTV, Risk: {customer['churn_risk_score']}/10")

        client.close()
        return True

    except Exception as e:
        print(f"\n   Failed to setup database: {str(e)}")
        print(f"\n   Make sure MongoDB is running:")
        print(f"      • Local: start with the 'mongod' command")
        print(f"      • Atlas: check your connection string")
        return False

# Uncomment the next line to create sample data
# setup_sample_crm_database()

print("\n To create sample data, uncomment the line above and run this cell.")


Checkpoint: What We’ve Built So Far

Great progress! Let’s recap:

Function 1: Translates natural language → MongoDB queries (using fine-tuned model)
Function 2: Executes queries on MongoDB
Sample Data: Created realistic test customers (optional)

Next up: Functions 3 & 4, then we’ll wire everything together!


Step 4.5: Generate Executive Report

Now we have query results from MongoDB. Let’s turn them into professional business reports!

@tool
def generate_executive_report(
    query_results: str,  # JSON string of query results
    report_title: str,
    analysis_prompt: str = "Analyze these customer records and provide strategic insights",
    filename: str = None
) -> Dict[str, Any]:
    """
    Generates a professional executive report from query results.

    Args:
        query_results: JSON string containing query results
        report_title: Title for the report
        analysis_prompt: Specific analysis instructions
        filename: Output filename (auto-generated if not provided)

    Returns:
        Dict containing the report filename and a summary
    """
    print(f"\n📝 Generating executive report: '{report_title}'")

    try:
        # Parse the query results
        if isinstance(query_results, str):
            results_data = json.loads(query_results)
        else:
            results_data = query_results

        results = results_data.get("results", [])
        count = results_data.get("count", 0)

        # Generate insights using the UBIAI API
        insights_prompt = f"""{analysis_prompt}

Data Summary:
- Total Records: {count}
- Sample Data: {json.dumps(results[:5], indent=2)}

Provide:
1. Executive Summary (2-3 sentences)
2. Key Findings (3-5 bullet points)
3. Strategic Recommendations (3-5 action items)
4. Risk Assessment
5. Next Steps

Format as a professional business report."""

        payload = {
            "input_text": "",
            "system_prompt": "You are a business intelligence analyst creating executive reports.",
            "user_prompt": insights_prompt,
            "temperature": 0.7,
            "monitor_model": True,
            "knowledge_base_ids": [],
            "session_id": "",
            "images_urls": []
        }

        response = requests.post(
            UBIAI_CONFIG["api_url"] + UBIAI_CONFIG["model_token"],
            json=payload,
            timeout=60
        )

        if response.status_code != 200:
            return {"error": f"Failed to generate insights: {response.status_code}"}

        insights = response.json().get("output", "No insights generated")

        # Create the Word document
        if filename is None:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"Executive_Report_{timestamp}.docx"

        doc = Document()

        # Title
        title = doc.add_heading(report_title, 0)
        title.alignment = WD_ALIGN_PARAGRAPH.CENTER

        # Metadata
        doc.add_paragraph(f"Generated: {datetime.now().strftime('%B %d, %Y at %I:%M %p')}")
        doc.add_paragraph(f"Total Records Analyzed: {count}")
        doc.add_paragraph()

        # Insights
        doc.add_heading('Analysis & Insights', 1)
        doc.add_paragraph(insights)

        # Data table
        doc.add_heading('Detailed Data', 1)

        if results:
            # Create the table
            sample_size = min(10, len(results))
            keys = list(results[0].keys())

            table = doc.add_table(rows=1, cols=len(keys))
            table.style = 'Light Grid Accent 1'

            # Header row
            header_cells = table.rows[0].cells
            for i, key in enumerate(keys):
                header_cells[i].text = key.replace('_', ' ').title()

            # Data rows (first 10)
            for record in results[:sample_size]:
                row_cells = table.add_row().cells
                for i, key in enumerate(keys):
                    value = record.get(key, '')
                    row_cells[i].text = str(value)

            if len(results) > sample_size:
                doc.add_paragraph(f"\n(Showing {sample_size} of {count} total records)")

        # Footer
        doc.add_paragraph()
        footer = doc.add_paragraph("―" * 50)
        footer.add_run("\nGenerated by CRM Intelligence Agent | Powered by UBIAI")

        # Save the document
        doc.save(filename)

        print(f"✅ Report generated: {filename}")

        return {
            "success": True,
            "filename": filename,
            "summary": f"Report contains {count} records with detailed analysis",
            "insights_preview": insights[:200] + "..."
        }

    except Exception as e:
        return {
            "success": False,
            "error": f"Failed to generate report: {str(e)}"
        }

print("\n✅ Function 3 (generate_executive_report) defined")
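Two small conventions in the table-building step above are worth isolating: column headers are derived from snake_case field names, and at most ten records are rendered in the document. A standalone sketch of just that logic (plain Python, no `python-docx` dependency):

```python
def header_labels(record: dict) -> list:
    # Turn snake_case field names into title-cased column headers
    return [key.replace('_', ' ').title() for key in record.keys()]

def sample_rows(results: list, max_rows: int = 10) -> list:
    # Render at most max_rows records in the document table
    return results[:min(max_rows, len(results))]

rows = [{"customer_id": "C1", "lifetime_value": 12000.0}]
print(header_labels(rows[0]))  # ['Customer Id', 'Lifetime Value']
print(sample_rows(rows))
```

Keeping these helpers separate makes them easy to unit-test independently of Word-document generation.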

Step 4.6: Send Report via Email

Finally, we define a function that emails the report to the sales team.

@tool
def send_report_email(
    recipient_emails: str,  # Comma-separated email addresses
    report_filename: str,
    subject: str,
    message_body: str = None,
    cc_emails: str = None
) -> Dict[str, Any]:
    """
    Sends an executive report via email to specified recipients.

Args:
    recipient_emails: Comma-separated list of recipient email addresses
    report_filename: Path to the report file to attach
    subject: Email subject line
    message_body: Optional custom email body
    cc_emails: Optional comma-separated CC recipients

Returns:
    Dict containing delivery status and recipient info
"""
print(f"\n📧 Sending report via email...")

try:
    # Parse recipient emails
    recipients = [email.strip() for email in recipient_emails.split(',')]
    cc_list = [email.strip() for email in cc_emails.split(',')] if cc_emails else []

    # Create email message
    msg = MIMEMultipart()
    msg['From'] = EMAIL_CONFIG['sender_email']
    msg['To'] = ', '.join(recipients)
    if cc_list:
        msg['Cc'] = ', '.join(cc_list)
    msg['Subject'] = subject
    msg['Date'] = datetime.now().strftime('%a, %d %b %Y %H:%M:%S %z')

    # Email body
    if message_body is None:
        message_body = f"""Dear Team,

Please find attached the CRM Intelligence Report generated on {datetime.now().strftime('%B %d, %Y')}.

This report contains critical insights from our customer database analysis. Please review the key findings and recommended actions.

Key highlights:
• Data-driven insights from recent customer analysis
• Strategic recommendations for customer retention
• Risk assessment and mitigation strategies

For questions or further analysis, please contact the Business Intelligence team.

Best regards,
CRM Intelligence Agent
Powered by UBIAI

―――――――――――――――――――――――――――――――
This is an automated report. Please do not reply to this email.
"""

    msg.attach(MIMEText(message_body, 'plain'))

    # Attach report file
    try:
        with open(report_filename, 'rb') as attachment:
            part = MIMEBase('application', 'octet-stream')
            part.set_payload(attachment.read())
            encoders.encode_base64(part)
            part.add_header(
                'Content-Disposition',
                f'attachment; filename="{os.path.basename(report_filename)}"'
            )
            msg.attach(part)
    except FileNotFoundError:
        return {
            "success": False,
            "error": f"Report file not found: {report_filename}"
        }

    # Send email
    server = smtplib.SMTP(EMAIL_CONFIG['smtp_server'], EMAIL_CONFIG['smtp_port'])
    server.starttls()
    server.login(EMAIL_CONFIG['sender_email'], EMAIL_CONFIG['sender_password'])

    all_recipients = recipients + cc_list
    server.sendmail(EMAIL_CONFIG['sender_email'], all_recipients, msg.as_string())
    server.quit()

    print(f"✅ Email sent successfully to {len(recipients)} recipient(s)")

    return {
        "success": True,
        "recipients": recipients,
        "cc": cc_list,
        "subject": subject,
        "attachment": os.path.basename(report_filename),
        "sent_at": datetime.now().isoformat()
    }

except smtplib.SMTPAuthenticationError:
    return {
        "success": False,
        "error": "Email authentication failed. Check sender_email and sender_password in EMAIL_CONFIG."
    }
except Exception as e:
    return {
        "success": False,
        "error": f"Failed to send email: {str(e)}"
    }

print("\n✅ Function 4 (send_report_email) defined")
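For reference, the MIME assembly above can also be expressed with the newer stdlib `email.message.EmailMessage` API, which handles base64 encoding and headers for you. This sketch builds (but does not send) a message with a binary attachment; the addresses are placeholders:

```python
from email.message import EmailMessage

def build_report_email(sender: str, recipients: list, subject: str,
                       body: str, attachment_bytes: bytes, filename: str) -> EmailMessage:
    # Assemble a multipart message with a binary attachment (no SMTP involved)
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    msg['Subject'] = subject
    msg.set_content(body)
    msg.add_attachment(attachment_bytes,
                       maintype='application', subtype='octet-stream',
                       filename=filename)
    return msg

msg = build_report_email('agent@example.com', ['sales@example.com'],
                         'Weekly Report', 'See attached.', b'dummy', 'report.docx')
print(msg['Subject'])  # Weekly Report
```

Separating message construction from `smtplib` delivery also makes the attachment logic testable without a live mail server.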

Step 5: Initialize the Complete Agent

All that's left is to wire the agent components together.

from langchain_openai import ChatOpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.prompts.chat import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory

# Initialize memory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Define all tools
tools = [
    translate_nl_to_query,
    execute_mongodb_query,
    generate_executive_report,
    send_report_email
]

# Initialize LLM (using GPT-4 for agent orchestration)
llm = ChatOpenAI(
    model="gpt-4",
    openai_api_key="your-openai-api-key-here",  # Update this
    temperature=0.3
)

# Create system prompt
system_prompt = """You are a CRM Intelligence Agent specializing in database analysis and reporting.

Your workflow:
1. Use translate_nl_to_query() to convert natural language questions into MongoDB queries
2. Use execute_mongodb_query() to run the query and get results
3. Use generate_executive_report() to create professional reports from the data
4. Use send_report_email() to deliver reports to stakeholders

Always:
- Confirm actions before executing irreversible operations (sending emails)
- Provide clear summaries at each step
- Handle errors gracefully and suggest alternatives
- Use the fine-tuned query translation model for accurate database queries
"""

# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Initialize agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    memory=memory,
    agent_kwargs={
        "prompt": prompt
    }
)

print("\n✅ CRM Intelligence Agent initialized successfully!")
print("\n🤖 Available capabilities:")
print("  1. Natural language → MongoDB query translation (fine-tuned)")
print("  2. Database query execution")
print("  3. Executive report generation")
print("  4. Automated email delivery")


Step 6: Practical Demonstration

Let’s see the complete agent in action with a real scenario!

Churn Risk Analysis

print("\n" + "=" * 80)
print("SCENARIO 1: HIGH-VALUE CHURN RISK ANALYSIS")
print("=" * 80 + "\n")

query_1 = """I need to identify all customers who:

• Have a lifetime value greater than $10,000
• Have a churn risk score of 7 or higher
• Are currently active

Please create a report titled 'High-Value Churn Risk Analysis' and send it to
sales-team@company.com and customer-success@company.com
"""

try:
    response_1 = agent({"input": query_1})
    print("\nAgent Response:")
    print(response_1["output"])
except Exception as e:
    print(f"\n❌ Error: {str(e)}")
    print("\nNote: Ensure MongoDB is running and email credentials are configured.")
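For context, given the schema introduced earlier (`lifetime_value`, `churn_risk_score`, `status`), the fine-tuned translator would be expected to produce a filter along these lines. This is a hand-written illustration of the target output, not actual model output:

```python
# Hand-written MongoDB filter matching the three conditions in query_1
expected_filter = {
    "lifetime_value": {"$gt": 10000},
    "churn_risk_score": {"$gte": 7},
    "status": "active",
}
print(expected_filter)
```

Note how each natural-language condition maps to a concrete field name and operator; this mapping is exactly what generic models get wrong and fine-tuning fixes.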

Performance Comparison: Generic vs Fine-Tuned Agent

Generic Agent Results:

❌ Query Accuracy: ~60%
❌ Schema Understanding: Poor (often uses wrong field names)
❌ Query Optimization: Suboptimal (slow queries)
❌ Business Logic: Doesn't understand company-specific rules
❌ Hallucination Rate: High (makes up collection names)
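Accuracy figures like these can be reproduced on your own data with a small evaluation harness: run each held-out question through the translator and count exact matches against a gold query. A minimal sketch with a stubbed translator standing in for the model (all names are illustrative):

```python
def exact_match_accuracy(examples, translate):
    # examples: list of (question, gold_query) pairs
    # translate: function mapping a question to a query dict
    hits = sum(1 for question, gold in examples if translate(question) == gold)
    return hits / len(examples) if examples else 0.0

# Stub translator standing in for the fine-tuned model
def stub_translate(question):
    return {"status": "active"} if "active" in question else {}

examples = [
    ("show active customers", {"status": "active"}),
    ("show churned customers", {"status": "inactive"}),
]
print(exact_match_accuracy(examples, stub_translate))  # 0.5
```

Exact match is a strict metric; in practice you may also want execution accuracy (do the gold and generated queries return the same rows?), which forgives semantically equivalent filters.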

Final Thoughts

Generic AI agents are powerful, but fine-tuned agents are game-changers for business applications. By embedding domain knowledge directly into the model, you create tools that:

  • Actually understand your business
  • Provide reliable, consistent results
  • Cost less to operate
  • Empower non-technical users

The future of AI in enterprise isn't just about bigger models; it's about smarter, specialized agents that truly align with your business goals.


🙏 Thank You!

We hope you found this tutorial helpful.

Questions? Please reach out at admin@ubiai.tools!

