
AI Customer Retention with Multi-Agent Chat Architecture

Client

Customer Experience Platform

Industry

Technology (IT)

Timeline

3 weeks

Type

AI Chatbots & Automation

Overview

The team needed a reliable way to handle cancellation conversations without losing context, sounding robotic, or giving inconsistent policy answers. In a 3-week build, we delivered a customer retention system built on a multi-agent chatbot architecture: it authenticates users, routes intent, generates retention offers from rules, and finalizes outcomes through a single API flow. The result is a reusable foundation for retrieval-augmented (RAG) customer support that lets the team absorb growing ticket volume without a matching increase in support complexity.

Challenge

The organization had no stable retention workflow, no production-ready support orchestration, and limited bandwidth for handling cancellation-heavy conversations manually.

Key risks before implementation:

Cancellation chats were hard to standardize, so response quality could drift.
Policy details could be missed or paraphrased inconsistently without grounded retrieval.
Routing technical and billing intents from the same conversation path created operational friction.
Without structured logging and final-action tracking, there was no dependable audit trail.

Left unresolved, these risks would increase churn, reduce trust in support interactions, and make scaling harder as ticket volume grows.

Solution

We implemented a modular multi-agent AI chatbot architecture with explicit state transitions and tool-backed actions.

Core design decisions:

LangGraph was used for deterministic agent routing and state management.
A three-agent model separated concerns: Greeter for authentication and intent classification, Retention for offer strategy and policy-aware responses, and Processor for final action execution and closure.
A rule-driven offer engine generated tier-aware alternatives instead of hardcoded static scripts.
A retrieval-augmented generation (RAG) layer retrieved relevant policy context from indexed internal documents.
FastAPI exposed a session-aware chat endpoint for new and continuing conversations.

Tradeoff: we optimized for correctness, maintainability, and extensibility over flashy UI scope, since the business priority was reliable retention logic and technical proof of work.

Key Features

Email-based customer lookup and profile-aware conversation context
Intent routing across cancellation, technical support, and billing pathways
Tier- and reason-based retention offer calculation from configurable rules
Policy-grounded responses through semantic retrieval from internal documents
Session-based multi-turn conversations via API
Structured status updates and logs for downstream review and auditing

Technical Implementation

Backend & Infrastructure

The service is built in Python with FastAPI and an async request flow. Conversation state is passed through a compiled LangGraph graph, with conditional edges deciding handoff between greeter, retention, and processor nodes. Session continuity is handled through session IDs so users can continue the same thread across requests.
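The routing and session behavior described above can be illustrated with a small, library-free sketch. It is not the project's LangGraph code: the node functions, the keyword-based `classify_intent` stub, the `EDGES` table, and the in-memory `SESSIONS` store are all hypothetical stand-ins, chosen only to show how conditional edges and session IDs fit together.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

END = "end"


@dataclass
class ChatState:
    session_id: str
    intent: Optional[str] = None          # "cancel", "billing", or "technical"
    messages: List[str] = field(default_factory=list)
    path: List[str] = field(default_factory=list)  # nodes visited this turn


def classify_intent(message: str) -> str:
    """Keyword stub standing in for the LLM-based intent classifier."""
    text = message.lower()
    if "cancel" in text:
        return "cancel"
    if "invoice" in text or "charge" in text or "billing" in text:
        return "billing"
    return "technical"


def greeter(state: ChatState) -> ChatState:
    state.path.append("greeter")
    if state.intent is None:  # classify only once per session
        state.intent = classify_intent(state.messages[-1])
    return state


def retention(state: ChatState) -> ChatState:
    state.path.append("retention")
    return state


def processor(state: ChatState) -> ChatState:
    state.path.append("processor")
    return state


NODES: Dict[str, Callable[[ChatState], ChatState]] = {
    "greeter": greeter,
    "retention": retention,
    "processor": processor,
}

# Conditional edges: each function inspects the state and names the next node.
EDGES: Dict[str, Callable[[ChatState], str]] = {
    "greeter": lambda s: "retention" if s.intent == "cancel" else "processor",
    "retention": lambda s: "processor",
    "processor": lambda s: END,
}

SESSIONS: Dict[str, ChatState] = {}  # in-memory store, for illustration only


def handle_message(message: str, session_id: Optional[str] = None) -> ChatState:
    """Create or resume a session, then walk the graph for this turn."""
    if session_id is None or session_id not in SESSIONS:
        session_id = uuid.uuid4().hex
        SESSIONS[session_id] = ChatState(session_id=session_id)
    state = SESSIONS[session_id]
    state.messages.append(message)
    state.path = []
    node = "greeter"
    while node != END:
        state = NODES[node](state)
        node = EDGES[node](state)
    return state
```

Calling `handle_message("I want to cancel my plan")` starts a new session and walks greeter, retention, and processor; passing the returned `session_id` on the next call resumes the same state instead of creating a new one.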

Data & AI Components

The system uses closed-source LLMs for agent reasoning and closed-source embeddings for retrieval. Policy documents are chunked and stored in ChromaDB, then queried at runtime when retention or policy clarification is needed. This keeps responses grounded and reduces unsupported statements.
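As a rough illustration of the chunk-then-retrieve step, the sketch below splits text into fixed-size chunks and ranks them against a query. It deliberately swaps in a bag-of-words cosine similarity for the closed-source embeddings and a plain list for ChromaDB, so every function name here is an assumption made for the example rather than the project's retrieval code.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, max_words: int = 40) -> List[str]:
    """Split a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a lowercase bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In a setup like the one described, the top-ranked chunks would be passed to the agent as context alongside the customer's message, so the reply can quote policy rather than paraphrase it from memory.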

Retention logic is implemented as a tool-driven rules layer:

Customer tier and cancellation reason are mapped to eligible offer strategies.
Fallback logic ensures the system still responds coherently when exact matches are unavailable.
Final actions are written to a structured update log for traceability.
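The three points above can be sketched as a small lookup with layered fallbacks and an append-only log. The tiers, reasons, offer names, and log fields below are invented placeholders, not the client's actual rules; only the overall shape (exact match, then tier default, then generic fallback, then a structured log entry) reflects the description.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Hypothetical rule table: (tier, cancellation reason) -> offer strategy.
OFFER_RULES: Dict[Tuple[str, str], str] = {
    ("premium", "price"): "discount_3_months",
    ("premium", "not_using"): "pause_subscription",
    ("basic", "price"): "downgrade_plan",
}

# Hypothetical fallbacks used when no exact (tier, reason) rule exists.
TIER_DEFAULTS: Dict[str, str] = {"premium": "account_review_call"}
GENERIC_FALLBACK = "pause_subscription"


def select_offer(tier: str, reason: str) -> str:
    """Pick an offer: exact rule first, then tier default, then generic fallback."""
    key = (tier.lower(), reason.lower())
    if key in OFFER_RULES:
        return OFFER_RULES[key]
    return TIER_DEFAULTS.get(key[0], GENERIC_FALLBACK)


def log_action(log: List[str], customer_email: str, action: str, offer: str) -> dict:
    """Append one JSON line describing the final action and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "customer": customer_email,
        "action": action,
        "offer": offer,
    }
    log.append(json.dumps(entry))
    return entry
```

Keeping the rules in a data table rather than in branching code means a new tier or reason is a configuration change, and the JSON-lines log can be filtered or replayed during later review.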

Frontend & User Experience

This delivery focused on API-first support orchestration and CLI/API interaction rather than a custom visual frontend. The UX value came from conversation quality: less repetitive questioning, clearer transitions between support paths, and more contextual offer messaging.

Security & Reliability

Configuration is environment-driven with Pydantic settings. Health checks surface API, RAG, and model readiness. Optional tracing via Langfuse supports observability for debugging and quality monitoring. Tests cover core agent routing, tool logic, and RAG behavior to protect regression-sensitive paths.
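One way a readiness report covering API, RAG, and model components might be aggregated is sketched below. The function name, the component keys, and the "ok" / "unavailable" / "degraded" labels are assumptions for illustration; the actual endpoint's response shape is not documented here.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, object]:
    """Run each named check and summarize overall readiness.

    A check that returns False or raises is reported as "unavailable",
    so one failing dependency cannot crash the health endpoint itself.
    """
    components: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            components[name] = "ok" if check() else "unavailable"
        except Exception:
            components[name] = "unavailable"
    all_ok = all(value == "ok" for value in components.values())
    return {"status": "ok" if all_ok else "degraded", "components": components}
```

Each check is just a zero-argument callable, for example one that confirms the vector store is reachable, which keeps the report easy to extend as new dependencies are added.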

Results

This project was intentionally scoped as a credibility-first technical implementation, so the outcomes are qualitative rather than measured:

More consistent cancellation handling through explicit agent orchestration
Better policy accuracy through retrieval-grounded responses
Cleaner separation of concerns across intake, retention, and processing logic
Stronger engineering baseline for future production hardening

Technology Stack

AI/ML: LangGraph, LangChain, Closed-source Chat Models, Closed-source Embeddings
Backend: Python, FastAPI, Pydantic, ChromaDB
Frontend: Rich
Infrastructure: Uvicorn, Langfuse
