# Conversation Insight Extraction & Summarization

**Created at** 2025-10-30

**Created by** Manoj Vala

### (Integrated with `/finalize` Celery Task)

---

## Objective

To automatically analyze and summarize customer–agent conversations when the `/finalize` endpoint is called, extending the existing summarization logic to include **conversation insight extraction** such as topic relevance, unanswered questions, miscommunication, and goal achievement. **This avoids loading the same resources again from the database and from Azure Blob.**

This ensures that each call automatically generates a complete structured report that can later be displayed in the **Agent Post-Action section**.

---

## Current System Overview

When an agent finishes a conversation:

1. The backend triggers the `/finalize` API endpoint.
2. A **Celery task** (`summarize_conversation_task`) is enqueued.
3. That Celery worker currently performs:
   * Conversation summarization using GPT
   * Report generation (stored in Azure Blob or DB)

Now, this Celery task will be extended to also perform **deep conversation analysis** using the same GPT request.

---

## Updated Workflow

### Step 1: `/finalize` Endpoint Trigger

**Endpoint:** `POST /finalize`

**Behavior:**

* Receives `conversation_id`, `tenant_id`, and related metadata.
* Pushes a Celery job into the queue:

```python
finalize_conversation.delay(conversation_id, tenant_id, topic)
```

---

### Step 2: Celery Task Execution

**Task Name:** `finalize_conversation`

**Responsibilities:**

1. Load conversation data from Azure Blob or DB.
2. Format conversation messages and metadata.
3. Prepare the prompt for GPT analysis.
4. Invoke GPT for **summarization + analysis**.
5. Store the structured results (in DB or Blob).
6. Log and handle any exceptions.

Example prompt:

```
You are an expert conversation analyst.
Analyze the following conversation between a 'customer' and an 'agent'.
Your response MUST be a single valid JSON object with these fields:

1. "summary": (string)
2. "appointment_requested": (boolean)
3. "phone_number": (string or null)
4. "first_name": (string or null)
5. "last_name": (string or null)
6. "unanswered_question": (string or null)
7. "achieved_goal": (string) describe if the customer’s goal was achieved, with a reason.
8. "mis-communication": (string) describe if something was misunderstood or ignored.
9. "topic_relevance_reason": (string) analyze if the customer input relates to topic '{topic}', with explanation.
10. "outcome": (string) success | failure | neutral

Conversation Transcript:
{formatted_conversation}
```

---

### Step 3: GPT-Based Unified Summarization & Analysis

This GPT request replaces the old “summary-only” flow. It now produces a **complete insight object** that includes all key conversation metrics and topic-level reasoning.

---

### Step 4: Data Storage

The structured JSON is stored along with the conversation metadata, for example:

```json
{
  "conversation_id": "d52cd31d-cdd5-4343-a03e-9e5f01de3ed1",
  "topic": "Appointment Booking",
  "summary": "The customer requested a booking for Friday and confirmed timing.",
  "appointment_requested": true,
  "achieved_goal": "Yes, appointment confirmed successfully.",
  "mis-communication": "Agent ignored a query about location.",
  "topic_relevance_reason": "All exchanges were directly related to appointment scheduling.",
  "outcome": "success",
  "created_at": "2025-10-28T13:45:00Z"
}
```

---

### Step 5: Display in Agent Post-Action Section

In the **Agent > Post Action** page:

* The system queries all analyzed conversations associated with that agent.
* Each conversation displays:
  * Summary
  * Achieved Goal (Yes/No + Reason)
  * Miscommunication Reason
  * Topic Relevance Reason
  * Outcome

This provides agents and supervisors with direct insight into conversation quality and relevance.

---

## Logging & Observability

Comprehensive logs ensure debugging and traceability:

```
[INFO] Celery task triggered for conversation finalize
[INFO] Loading conversation d52cd31d-cdd5-4343-a03e-9e5f01de3ed1
[INFO] Sending conversation to GPT for summarization & analysis
[INFO] GPT response received and parsed successfully
[INFO] Analysis stored: outcome=success, topic_relevance=high
[INFO] Task completed for conversation d52cd31d-cdd5-4343-a03e-9e5f01de3ed1
```

---

## Benefits

* Seamless integration: no new endpoint needed.
* Automatic trigger on `/finalize` keeps flow event-driven.
* GPT handles both **summarization** and **deep insight extraction**.
* Reports are readily available in the **Agent Post-Action section**.
* Fully observable via structured logs.

---
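
Log lines in the `[LEVEL] message` shape shown under Logging & Observability could be produced with Python's standard `logging` module. This is a rough sketch only: the logger name `finalize` and the helpers `configure_logging` and `run_with_logging` are assumptions made for illustration, and `analyze` stands in for whatever callable performs the load, GPT call, and storage.

```python
import logging

logger = logging.getLogger("finalize")  # hypothetical logger name


def configure_logging() -> None:
    """Emit records in the "[LEVEL] message" shape used in the sample log."""
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def run_with_logging(conversation_id, analyze):
    """Run analyze(conversation_id) and log the start, success, or failure."""
    logger.info("Loading conversation %s", conversation_id)
    try:
        result = analyze(conversation_id)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Analysis failed for conversation %s", conversation_id)
        return None
    logger.info("Analysis stored: outcome=%s", result.get("outcome"))
    logger.info("Task completed for conversation %s", conversation_id)
    return result
```

Including the conversation ID in every message keeps a single conversation traceable across worker logs, and routing failures through `logger.exception` covers the "log and handle any exceptions" responsibility from Step 2.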