# 📋 Community Scribe & Assistant
### Elevator Pitch
The **Community Scribe & Assistant** is a Discord-integrated AI agent that listens in on voice channels, transcribes conversations, and autonomously extracts action items. After each meeting, the assistant summarizes discussions and proposes relevant follow-ups—like creating new channels, scheduling events, or pinging responsible parties—either performing them directly or queuing them for human approval.
This tool transforms casual coordination into structured execution without disrupting real-time collaboration.
---
## ✨ Key Features
- ✅ **Voice-to-Text Transcription**
  - Joins Discord voice channels and captures user audio
  - Converts speech to text via OpenAI Whisper (or similar)
- ✍️ **Speaker-Labeled Transcript Logging**
  - Stores timestamped, speaker-separated transcripts
  - Accessible via file export or channel posting
- 🧠 **AI-Driven Meeting Summarization**
  - Generates a natural-language summary after the meeting
  - Extracts decisions, themes, and takeaways
- 🪄 **Action Item Detection & Agentic Suggestions**
  - Uses LLMs to identify tasks and opportunities
  - Produces structured actions (create channel, schedule event, notify user)
  - Marks actions as auto-executable or needing review
- 🤖 **n8n Automation Integration**
  - Triggers post-processing workflows in n8n
  - Automatically performs Discord API actions (via bots)
  - Supports approval steps for human-in-the-loop control
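
The structured actions listed above could be modeled as a discriminated union with a routing step that marks each one as auto-executable or needing review. This is only a sketch: the type names, field names, and the policy that only notifications are safe to run automatically are assumptions made for illustration, not decisions from this document.

```typescript
// Hypothetical shape for a structured action; field names are illustrative.
type Action =
  | { kind: "create_channel"; name: string }
  | { kind: "schedule_event"; title: string; startsAt: string }
  | { kind: "notify_user"; user: string; message: string };

type Routed = { action: Action; mode: "auto" | "review" };

// Assumed policy: only notifications run without human approval.
const AUTO_KINDS = new Set<Action["kind"]>(["notify_user"]);

// Tag every action with how it should be handled downstream.
function routeActions(actions: Action[]): Routed[] {
  return actions.map((action): Routed => ({
    action,
    mode: AUTO_KINDS.has(action.kind) ? "auto" : "review",
  }));
}

const routed = routeActions([
  { kind: "create_channel", name: "design-review" },
  { kind: "notify_user", user: "bob", message: "Please schedule the sync." },
]);
console.log(routed.map((r) => `${r.action.kind}: ${r.mode}`));
```

Keeping the policy in one set makes it easy to tighten or loosen what runs unattended without touching the code that executes actions.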
---
## 🔁 Example Workflow
1. Meeting happens in a Discord voice channel
2. Bot records and transcribes audio in real time (buffered chunks)
3. After the call:
   - Transcript is compiled
   - LLM summarizes key points
   - Tasks are extracted and structured
4. n8n automation:
   - Auto-performs safe tasks
   - Queues sensitive ones for approval
5. Report posted to summary channel or DM’d to moderator
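
The "transcript is compiled" step could look roughly like the following when audio arrives as buffered chunks. The `Segment` shape and the `compileTranscript` helper are hypothetical; the sketch assumes each chunk has already been transcribed and tagged with a speaker and a start offset.

```typescript
// Hypothetical segment shape: one transcribed utterance from a buffered chunk.
interface Segment {
  speaker: string;
  startMs: number; // offset from the start of the meeting
  text: string;
}

// Sort segments by start time, then merge consecutive segments from the same speaker.
function compileTranscript(segments: Segment[]): Segment[] {
  const ordered = [...segments].sort((a, b) => a.startMs - b.startMs);
  const merged: Segment[] = [];
  for (const seg of ordered) {
    const last = merged[merged.length - 1];
    if (last && last.speaker === seg.speaker) {
      // Same speaker kept talking: append to the previous entry.
      last.text = `${last.text} ${seg.text.trim()}`;
    } else {
      // Copy the segment so the caller's input is never mutated.
      merged.push({ ...seg, text: seg.text.trim() });
    }
  }
  return merged;
}

const compiled = compileTranscript([
  { speaker: "bob", startMs: 9000, text: "I'll schedule the sync." },
  { speaker: "alice", startMs: 0, text: "We should create" },
  { speaker: "alice", startMs: 4000, text: "a design-review channel." },
]);
console.log(compiled);
```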
---
## ✅ MVP Flow 1: Post-Meeting Transcript Automation
### 🧠 Overview
After a Discord voice chat, the bot saves the full meeting audio, transcribes it using Whisper, then sends the transcript to an AI agent to extract a summary and action items, and finally triggers a workflow (via n8n or similar) to act on those.
---
### 🧭 Step-by-Step Flow
1. Bot joins a voice channel
2. Records entire meeting to a file (e.g., .ogg or .wav)
3. After call ends:
   - Transcribe file using Whisper (local or API)
   - Create structured transcript JSON
   - Send JSON to n8n (via webhook, file upload, or poll trigger)
4. n8n:
   - Summarizes transcript using GPT-4 or Claude
   - Extracts action items
   - Auto-executes or queues follow-up actions (e.g., create channel, ping user)
5. Post transcript + summary + action plan in Discord
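
As a rough illustration of the "create structured transcript JSON" step, the sketch below assembles a document from a list of utterances. The field names follow the example transcript JSON in this document; the `buildTranscript` helper, its parameters, and the minute-rounding rule are assumptions.

```typescript
interface Utterance {
  speaker: string;
  text: string;
}

interface Transcript {
  meeting_id: string;
  duration: string;
  participants: string[];
  transcript: Utterance[];
}

// Hypothetical helper: assemble the transcript document sent to the workflow engine.
function buildTranscript(
  meetingId: string,
  durationSeconds: number,
  utterances: Utterance[],
): Transcript {
  // Speakers in order of first appearance, without duplicates.
  // Silent attendees would have to come from the voice channel member list instead.
  const participants = Array.from(new Set(utterances.map((u) => u.speaker)));
  return {
    meeting_id: meetingId,
    duration: `${Math.round(durationSeconds / 60)}m`, // assumed format, e.g. "22m"
    participants,
    transcript: utterances,
  };
}

const doc = buildTranscript("abc123", 1320, [
  { speaker: "alice", text: "We should create a design-review channel." },
  { speaker: "bob", text: "I'll schedule the sync." },
]);
console.log(JSON.stringify(doc, null, 2));
```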
---
### ⚙️ MVP Tech Stack
| Component | Tech |
|----------|------|
| 🎧 Audio Capture | `@discordjs/voice` (Node.js) |
| 🎙️ Audio Format | Record in .ogg or .pcm, optionally pipe to .wav |
| 📼 Storage | Local disk or S3/GCS (optional temp storage) |
| 📝 Transcription | whisper.cpp, faster-whisper, or OpenAI Whisper API |
| 🧠 AI Summary + Actions | GPT-4 via OpenAI API or Claude via API |
| 🔁 Automation Orchestration | n8n (hosted or local) |
| 📬 Output | Discord Bot (via Discord.js HTTP) |
---
### 🧱 Example File Structure (after a meeting)

    /recordings/
      meeting-2025-06-27.ogg
    /transcripts/
      meeting-2025-06-27.json
      meeting-2025-06-27-summary.md

### 📝 Example Transcript JSON

    {
      "meeting_id": "abc123",
      "duration": "22m",
      "participants": ["alice", "bob", "carol"],
      "transcript": [
        { "speaker": "alice", "text": "We should create a design-review channel." },
        { "speaker": "bob", "text": "I'll schedule the sync." }
      ]
    }

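
Because the transcript JSON crosses a process boundary (bot to workflow engine), it may be worth validating it before use. The sketch below shows one possible runtime check plus a helper that flattens the transcript into prompt text; `isTranscript` and `toPromptText` are hypothetical names, and the field names are taken from the example above.

```typescript
interface Utterance {
  speaker: string;
  text: string;
}

interface Transcript {
  meeting_id: string;
  duration: string;
  participants: string[];
  transcript: Utterance[];
}

// Runtime check for untrusted JSON (for example, a webhook body).
function isTranscript(value: unknown): value is Transcript {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.meeting_id === "string" &&
    typeof v.duration === "string" &&
    Array.isArray(v.participants) &&
    v.participants.every((p) => typeof p === "string") &&
    Array.isArray(v.transcript) &&
    v.transcript.every(
      (u) =>
        typeof u === "object" &&
        u !== null &&
        typeof (u as Record<string, unknown>).speaker === "string" &&
        typeof (u as Record<string, unknown>).text === "string",
    )
  );
}

// Flatten the transcript into "speaker: text" lines for an LLM prompt.
function toPromptText(t: Transcript): string {
  return t.transcript.map((u) => `${u.speaker}: ${u.text}`).join("\n");
}

const raw = `{
  "meeting_id": "abc123",
  "duration": "22m",
  "participants": ["alice", "bob", "carol"],
  "transcript": [
    { "speaker": "alice", "text": "We should create a design-review channel." },
    { "speaker": "bob", "text": "I'll schedule the sync." }
  ]
}`;

const parsed: unknown = JSON.parse(raw);
if (isTranscript(parsed)) {
  console.log(toPromptText(parsed));
}
```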
---
### 🔁 Example n8n Workflow (Simplified)
1. Trigger: Webhook or File Watcher when transcript JSON is added
2. LLM Summary Node: OpenAI or Claude with prompt:
   “Summarize the following meeting transcript and list any actionable items.”
3. Parse JSON Output (summary + actions)
4. Conditional Logic:
   - Auto-perform: Call Discord API via HTTP node
   - Human approval: DM moderator or post card with buttons
5. Post summary to summary channel
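
The prompt and "parse JSON output" steps could be prototyped outside n8n along these lines. The reply contract (`summary` plus `actions` with a `needsApproval` flag), the helper names, and the sample reply are all assumptions for illustration; a real model reply may differ and should be treated as untrusted input.

```typescript
// Assumed reply contract: the model is asked to answer with JSON of this shape.
interface MeetingReport {
  summary: string;
  actions: { description: string; needsApproval: boolean }[];
}

function buildPrompt(transcriptText: string): string {
  return [
    "Summarize the following meeting transcript and list any actionable items.",
    'Reply with JSON only: {"summary": string, "actions": [{"description": string, "needsApproval": boolean}]}',
    "",
    transcriptText,
  ].join("\n");
}

// Models often wrap JSON in a Markdown code fence; strip it before parsing.
const FENCE = "`".repeat(3);

function stripFence(reply: string): string {
  let body = reply.trim();
  if (body.startsWith(FENCE)) body = body.slice(body.indexOf("\n") + 1);
  if (body.endsWith(FENCE)) body = body.slice(0, -FENCE.length);
  return body.trim();
}

// Returns null when the reply is not valid JSON or lacks the expected fields.
function parseReport(reply: string): MeetingReport | null {
  try {
    const data = JSON.parse(stripFence(reply)) as Partial<MeetingReport>;
    if (typeof data.summary !== "string" || !Array.isArray(data.actions)) return null;
    // Keep only well-formed action entries.
    const actions = data.actions.filter(
      (a) =>
        typeof a === "object" &&
        a !== null &&
        typeof a.description === "string" &&
        typeof a.needsApproval === "boolean",
    );
    return { summary: data.summary, actions };
  } catch {
    return null;
  }
}

// Made-up reply used only to exercise the parser.
const reply =
  FENCE +
  'json\n{"summary":"Team agreed to add a design-review channel.","actions":[' +
  '{"description":"Create #design-review","needsApproval":true},' +
  '{"description":"Ping bob about the sync","needsApproval":false}]}\n' +
  FENCE;

const report = parseReport(reply);
const auto = report ? report.actions.filter((a) => !a.needsApproval) : [];
const review = report ? report.actions.filter((a) => a.needsApproval) : [];
console.log(report?.summary, auto.length, review.length);
```

Returning `null` on any malformed reply lets the workflow fall back to human review instead of acting on a partial parse.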
---
### ✅ Advantages of This Stack
- Very fast to build and iterate on
- No concurrency or real-time edge cases
- Easy to test, debug, and scale
- Doesn’t require full streaming infrastructure or custom voice encoding
---
## 🛠️ MVP Stack 2
| Component | Tech |
|----------|------|
| Voice Connection | `@discordjs/voice` |
| Audio Decoding | `ffmpeg-static`, `@discordjs/opus` |
| Transcription | OpenAI Whisper (API or local) |
| AI Agent | OpenAI GPT-4 / Claude / Local LLM |
| Workflow Engine | `n8n` (hosted or self-hosted) |
| Output | Discord Bot (channel post or DM) |

---
PoC: an AI-generated summary of the meeting where we discussed this idea. That doc, or the process used to produce it, could be used to flesh out the PoC: https://hackmd.io/@Dekan/HJ1W2S3Vgx
---
## 🧪 MVP vs. Real-Time Transcription
| Feature | MVP (Post-Meeting) | Real-Time |
|--------|--------------------|------------|
| Complexity | 🟢 Low | 🔴 High |
| Accuracy | 🟢 High (full context) | 🟡 Lower (chunked) |
| Cost | 🟢 Lower (batched compute) | 🔴 Higher (streaming LLMs) |
| Latency | 🔴 Delayed summary | 🟢 Instant replies |
| Flexibility | 🟢 Easy to build & extend | 🔴 Harder to debug |
| Consent | 🟢 Clear & logged | 🟡 Must manage dynamically |
---
## ✅ Benefits of MVP Approach
- **Faster to prototype**: No need for low-latency streaming
- **Simplifies privacy**: Users opt in at meeting start
- **Enables powerful AI**: Full transcript allows better summarization and planning
- **More resilient**: No pressure to stay online or respond instantly
- **Extensible**: Easy to layer in automation, UI approval, or task tracking
---
## 🔮 Future Extensions
- 🎤 Real-time “Scribe is listening…” assistant
- 🧾 PDF or Notion export of transcripts
- 📅 Native calendar integration
- 🧩 Plugin framework for new task types
- 🗂️ Knowledge base / Q&A integration (“What did we decide last Tuesday?”)
---