# Conversation Insight Extraction & Summarization
**Created at** 2025-10-30
**Created by** Manoj Vala
### (Integrated with `/finalize` Celery Task)
---
## Objective
To automatically analyze and summarize customer–agent conversations when the `/finalize` endpoint is called, extending the existing summarization logic to include **conversation insight extraction** such as topic relevance, unanswered questions, miscommunication, and goal achievement. **Because the analysis runs in the same task as summarization, the conversation and related resources do not have to be loaded a second time from the database or from Azure Blob.**
This ensures that each call automatically generates a complete structured report that can later be displayed in the **Agent Post-Action section**.
---
## Current System Overview
When an agent finishes a conversation:
1. The backend triggers the `/finalize` API endpoint.
2. A **Celery task** (`summarize_conversation_task`) is enqueued.
3. That Celery worker currently performs:
   * Conversation summarization using GPT
   * Report generation (stored in Azure Blob or DB)

Now, this Celery task will be extended to also perform **deep conversation analysis** using the same GPT request.
---
## Updated Workflow
### Step 1: `/finalize` Endpoint Trigger
**Endpoint:**
`POST /finalize`
**Behavior:**
* Receives `conversation_id`, `tenant_id`, and related metadata.
* Pushes a Celery job into the queue:
```python
finalize_conversation.delay(conversation_id, tenant_id, topic)
```
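A minimal sketch of the enqueue step, assuming the request body arrives as a dict. `handle_finalize`, the response shapes, and the optional `topic` key are hypothetical; `enqueue` stands in for `finalize_conversation.delay` so the sketch runs without a Celery broker.

```python
from typing import Any, Callable, Mapping


def handle_finalize(
    payload: Mapping[str, Any],
    enqueue: Callable[[str, str, str], Any],
) -> dict:
    """Validate the request body and enqueue the finalize job."""
    # Reject the request early if a required identifier is missing or empty.
    missing = [k for k in ("conversation_id", "tenant_id") if not payload.get(k)]
    if missing:
        # Hypothetical error shape; the real API may respond differently.
        return {"status": "error", "missing": missing}

    # Assumption: `topic` is part of the "related metadata" and may be absent.
    topic = payload.get("topic", "")
    enqueue(payload["conversation_id"], payload["tenant_id"], topic)
    return {"status": "queued", "conversation_id": payload["conversation_id"]}
```

In a real handler, `enqueue` would be `finalize_conversation.delay`, so the endpoint returns as soon as the job is queued rather than waiting for GPT.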
---
### Step 2: Celery Task Execution
**Task Name:**
`finalize_conversation`
**Responsibilities:**
1. Load conversation data from Azure Blob or DB.
2. Format conversation messages and metadata.
3. Prepare the prompt for GPT analysis.
4. Invoke GPT for **summarization + analysis**.
5. Store the structured results (in DB or Blob).
6. Log and handle any exceptions.
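The responsibilities above could be wired together roughly as follows. This is a sketch under assumptions, not the actual task body: `run_finalize` and its four injected callables (`load`, `build_prompt`, `call_gpt`, `store`) are hypothetical names standing in for the Blob/DB loader, prompt builder, GPT client, and storage layer.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("finalize")


def run_finalize(
    conversation_id: str,
    tenant_id: str,
    topic: str,
    load: Callable[[str, str], list],
    build_prompt: Callable[[str, list], str],
    call_gpt: Callable[[str], dict],
    store: Callable[[str, dict], None],
) -> Optional[dict]:
    """Run the finalize pipeline once; return the insights, or None on failure."""
    try:
        messages = load(conversation_id, tenant_id)   # 1. load conversation
        prompt = build_prompt(topic, messages)        # 2-3. format and build prompt
        insights = call_gpt(prompt)                   # 4. summarization + analysis
        store(conversation_id, insights)              # 5. persist structured result
        return insights
    except Exception:
        # 6. log with traceback; this sketch swallows the error and returns None.
        logger.exception("Finalize failed for conversation %s", conversation_id)
        return None
```

Returning `None` on failure is one possible choice; re-raising instead would let the task queue's retry mechanism handle transient errors.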
Example prompt (prepared in step 3):
```
You are an expert conversation analyst.
Analyze the following conversation between a 'customer' and an 'agent'.
Your response MUST be a single valid JSON with these fields:
1. "summary": (string)
2. "appointment_requested": (boolean)
3. "phone_number": (string or null)
4. "first_name": (string or null)
5. "last_name": (string or null)
6. "unanswered_question": (string or null)
7. "achieved_goal": (string) describe if the customer’s goal was achieved, with a reason.
8. "mis-communication": (string) describe if something was misunderstood or ignored.
9. "topic_relevance_reason": (string) analyze if the customer input relates to topic '{topic}', with explanation.
10. "outcome": (string) success | failure | neutral
Conversation Transcript:
{formatted_conversation}
```
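Filling the two placeholders in the prompt above might look like the sketch below. It assumes each message is a dict with `role` and `text` keys, which the source does not specify; `format_conversation` and `build_prompt` are hypothetical helper names, and the template is abbreviated to keep the example short.

```python
PROMPT_TEMPLATE = """You are an expert conversation analyst.
Analyze the following conversation between a 'customer' and an 'agent'.
(... field list abbreviated in this sketch ...)
9. "topic_relevance_reason": (string) analyze if the customer input relates to topic '{topic}', with explanation.
Conversation Transcript:
{formatted_conversation}"""


def format_conversation(messages):
    """Render one 'role: text' line per message, skipping empty messages."""
    lines = []
    for msg in messages:
        text = str(msg.get("text", "")).strip()
        if text:
            lines.append(f"{msg.get('role', 'unknown')}: {text}")
    return "\n".join(lines)


def build_prompt(topic, messages):
    """Substitute the topic and the rendered transcript into the template."""
    return PROMPT_TEMPLATE.format(
        topic=topic,
        formatted_conversation=format_conversation(messages),
    )
```

`str.format` only interprets braces in the template itself, so braces inside customer messages are inserted verbatim.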
---
### Step 3: GPT-Based Unified Summarization & Analysis
This GPT request replaces the old “summary-only” flow.
Now it produces a **complete insight object** that includes all key conversation metrics and topic-level reasoning.
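Since the model is asked for a single JSON object, the worker would likely want to validate the reply before storing it. The sketch below shows one way to do that with the standard library; `parse_insights` and the split into required and nullable fields are assumptions derived from the prompt's field list, not confirmed behavior.

```python
import json

# Fields the prompt describes as always present.
REQUIRED_FIELDS = (
    "summary",
    "appointment_requested",
    "achieved_goal",
    "mis-communication",
    "topic_relevance_reason",
    "outcome",
)
# Fields the prompt allows to be null; defaulted to None if omitted.
NULLABLE_FIELDS = ("phone_number", "first_name", "last_name", "unanswered_question")
ALLOWED_OUTCOMES = {"success", "failure", "neutral"}


def parse_insights(raw: str) -> dict:
    """Parse the model reply and check it against the expected schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(data["appointment_requested"], bool):
        raise ValueError("appointment_requested must be a boolean")
    if data["outcome"] not in ALLOWED_OUTCOMES:
        raise ValueError(f"unexpected outcome: {data['outcome']!r}")
    for field in NULLABLE_FIELDS:
        data.setdefault(field, None)
    return data
```

Every failure surfaces as a `ValueError`, so the caller can log it and decide whether to retry the GPT request.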
---
### Step 4: Data Storage
The structured JSON is stored along with the conversation metadata, for example:
```json
{
"conversation_id": "d52cd31d-cdd5-4343-a03e-9e5f01de3ed1",
"topic": "Appointment Booking",
"summary": "The customer requested a booking for Friday and confirmed timing.",
"appointment_requested": true,
"achieved_goal": "Yes, appointment confirmed successfully.",
"mis-communication": "Agent ignored a query about location.",
"topic_relevance_reason": "All exchanges were directly related to appointment scheduling.",
"outcome": "success",
"created_at": "2025-10-28T13:45:00Z"
}
```
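A record of this shape could be assembled as in the sketch below. `build_record` is a hypothetical helper; it assumes the insight dict comes from the GPT step and that `created_at` is a UTC timestamp in the format shown above.

```python
from datetime import datetime, timezone


def build_record(conversation_id, topic, insights, now=None):
    """Merge GPT insights with conversation metadata into one storable dict.

    `now` must be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    return {
        **insights,
        # System-owned fields come last so the model output cannot override them.
        "conversation_id": conversation_id,
        "topic": topic,
        "created_at": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Passing `now` explicitly keeps the function deterministic in tests.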
---
### Step 5: Display in Agent Post-Action Section
In the **Agent > Post Action** page:
* The system queries all analyzed conversations associated with that agent.
* Each conversation displays:
  * Summary
  * Achieved Goal (Yes/No + Reason)
  * Miscommunication Reason
  * Topic Relevance Reason
  * Outcome
This provides agents and supervisors with direct insight into conversation quality and relevance.
---
## Logging & Observability
Comprehensive logs support debugging and traceability at each stage of the task:
```
[INFO] Celery task triggered for conversation finalize
[INFO] Loading conversation d52cd31d-cdd5-4343-a03e-9e5f01de3ed1
[INFO] Sending conversation to GPT for summarization & analysis
[INFO] GPT response received and parsed successfully
[INFO] Analysis stored: outcome=success, topic_relevance=high
[INFO] Task completed for conversation d52cd31d-cdd5-4343-a03e-9e5f01de3ed1
```
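Log lines in the `[LEVEL] message` shape shown above can be produced with Python's standard `logging` module. The sketch below is one possible setup; `make_logger` and the logger name `finalize` are hypothetical.

```python
import logging


def make_logger(name="finalize", stream=None):
    """Return a logger that writes '[LEVEL] message' lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep the root logger from printing each line again
    return logger
```

A call such as `make_logger().info("Loading conversation %s", conversation_id)` would then emit a line in the same format as the sample output.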
---
## Benefits
* Seamless integration: no new endpoint is needed.
* Automatic trigger on `/finalize` keeps the flow event-driven.
* GPT handles both **summarization** and **deep insight extraction**.
* Reports are readily available in the **Agent Post-Action section**.
* Fully observable via structured logs.
---