# 📋 Community Scribe & Assistant

### Elevator Pitch

The **Community Scribe & Assistant** is a Discord-integrated AI agent that listens in on voice channels, transcribes conversations, and autonomously extracts action items. After each meeting, the assistant summarizes discussions and proposes relevant follow-ups (creating new channels, scheduling events, or pinging responsible parties), either performing them directly or queuing them for human approval. This tool transforms casual coordination into structured execution without disrupting real-time collaboration.

---

## ✨ Key Features

- ✅ **Voice-to-Text Transcription**
  - Joins Discord voice channels and captures user audio
  - Converts speech to text via OpenAI Whisper (or similar)
- ✍️ **Speaker-Labeled Transcript Logging**
  - Stores timestamped, speaker-separated transcripts
  - Accessible via file export or channel posting
- 🧠 **AI-Driven Meeting Summarization**
  - Generates a natural-language summary after the meeting
  - Extracts decisions, themes, and takeaways
- 🪄 **Action Item Detection & Agentic Suggestions**
  - Uses LLMs to identify tasks and opportunities
  - Produces structured actions (create channel, schedule event, notify user)
  - Marks actions as auto-executable or needing review
- 🤖 **n8n Automation Integration**
  - Triggers post-processing workflows in n8n
  - Automatically performs Discord API actions (via bots)
  - Supports approval steps for human-in-the-loop control

---

## 🔁 Example Workflow

1. Meeting happens in a Discord voice channel
2. Bot records and transcribes audio in real time (buffered chunks)
3. After the call:
   - Transcript is compiled
   - LLM summarizes key points
   - Tasks are extracted and structured
4. n8n automation:
   - Auto-performs safe tasks
   - Queues sensitive ones for approval
5. Report is posted to the summary channel or DM’d to a moderator

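
Step 4 of the example workflow (auto-perform safe tasks, queue sensitive ones for approval) could be sketched as a small routing function. This is only an illustration under stated assumptions: the `ActionItem` shape, the `route_actions` helper, and the choice of which action types count as "safe" are all hypothetical and not defined anywhere in this document.

```python
from dataclasses import dataclass, field

# Hypothetical allow-list: action types assumed safe to run without review.
# Which actions are actually "safe" is a policy decision for the community.
AUTO_EXECUTABLE_TYPES = {"notify_user"}


@dataclass
class ActionItem:
    """One structured action extracted from a transcript (illustrative shape)."""
    type: str  # e.g. "create_channel", "schedule_event", "notify_user"
    description: str
    params: dict = field(default_factory=dict)


def route_actions(actions):
    """Split actions into (auto_execute, needs_review) lists."""
    auto, review = [], []
    for action in actions:
        if action.type in AUTO_EXECUTABLE_TYPES:
            auto.append(action)
        else:
            review.append(action)
    return auto, review


actions = [
    ActionItem("create_channel", "Create a design-review channel", {"name": "design-review"}),
    ActionItem("notify_user", "Remind bob to schedule the sync", {"user": "bob"}),
]
auto, review = route_actions(actions)
print([a.type for a in auto])    # ['notify_user']
print([a.type for a in review])  # ['create_channel']
```

Keeping the allow-list explicit means anything the LLM proposes that is not recognized falls through to human review by default, which matches the human-in-the-loop control described in the feature list.
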
---

## ✅ MVP Flow 1: Post-Meeting Transcript Automation

### 🧠 Overview

After a Discord voice chat, the bot saves the full meeting audio, transcribes it using Whisper, sends the transcript to an AI agent to extract a summary and action items, and finally triggers a workflow (via n8n or similar) to act on them.

---

### 🧭 Step-by-Step Flow

1. Bot joins a voice channel
2. Records the entire meeting to a file (e.g., .ogg or .wav)
3. After the call ends:
   1. Transcribe the file using Whisper (local or API)
   2. Create structured transcript JSON
   3. Send the JSON to n8n (via webhook, file upload, or poll trigger)
4. n8n:
   1. Summarizes the transcript using GPT-4 or Claude
   2. Extracts action items
   3. Auto-executes or queues follow-up actions (e.g., create channel, ping user)
5. Post transcript + summary + action plan in Discord

---

### ⚙️ MVP Tech Stack

| Component | Tech |
|----------|------|
| 🎧 Audio Capture | `@discordjs/voice` (Node.js) |
| 🎙️ Audio Format | Record in .ogg or .pcm, optionally pipe to .wav |
| 📼 Storage | Local disk or S3/GCS (optional temp storage) |
| 📝 Transcription | whisper.cpp, faster-whisper, or OpenAI Whisper API |
| 🧠 AI Summary + Actions | GPT-4 via OpenAI API or Claude via API |
| 🔁 Automation Orchestration | n8n (hosted or local) |
| 📬 Output | Discord Bot (via Discord.js HTTP) |

---

### 🧱 Example File Structure (after a meeting)

    /recordings/
      meeting-2025-06-27.ogg
    /transcripts/
      meeting-2025-06-27.json
      meeting-2025-06-27-summary.md

### 📝 Example Transcript JSON

    {
      "meeting_id": "abc123",
      "duration": "22m",
      "participants": ["alice", "bob", "carol"],
      "transcript": [
        { "speaker": "alice", "text": "We should create a design-review channel." },
        { "speaker": "bob", "text": "I'll schedule the sync." }
      ]
    }

---

### 🔁 Example n8n Workflow (Simplified)

1. Trigger: Webhook or File Watcher when transcript JSON is added
2. LLM Summary Node: OpenAI or Claude with the prompt: “Summarize the following meeting transcript and list any actionable items.”
3. Parse JSON Output (summary + actions)
4. Conditional Logic:
   - Auto-perform: Call Discord API via HTTP node
   - Human approval: DM moderator or post card with buttons
5. Post summary to summary channel

---

### ✅ Advantages of This Stack

- Very fast to build and iterate on
- No concurrency or real-time edge cases
- Easy to test, debug, and scale
- Doesn’t require full streaming infrastructure or custom voice encoding

---

## 🛠️ MVP Stack 2

| Component | Tech |
|----------|------|
| Voice Connection | `@discordjs/voice` |
| Audio Decoding | `ffmpeg-static`, `@discordjs/opus` |
| Transcription | OpenAI Whisper (API or local) |
| AI Agent | OpenAI GPT-4 / Claude / Local LLM |
| Workflow Engine | `n8n` (hosted or self-hosted) |
| Output | Discord Bot (channel post or DM) |

![1000005053](https://hackmd.io/_uploads/BkYNHwaExe.png)

---

PoC: AI-generated summary from the meeting where we talked about this (that doc or process could be used to flesh out the PoC):
https://hackmd.io/@Dekan/HJ1W2S3Vgx

---

## 🧪 MVP vs. Real-Time Transcription

| Feature | MVP (Post-Meeting) | Real-Time |
|--------|--------------------|------------|
| Complexity | 🟢 Low | 🔴 High |
| Accuracy | 🟢 High (full context) | 🟡 Lower (chunked) |
| Cost | 🟢 Lower (batched compute) | 🔴 Higher (streaming LLMs) |
| Latency | 🔴 Delayed summary | 🟢 Instant replies |
| Flexibility | 🟢 Easy to build & extend | 🔴 Harder to debug |
| Consent | 🟢 Clear & logged | 🟡 Must manage dynamically |

---

## ✅ Benefits of MVP Approach

- **Faster to prototype**: No need for low-latency streaming
- **Simplifies privacy**: Users opt in at meeting start
- **Enables powerful AI**: Full transcript allows better summarization + planning
- **More resilient**: No pressure to stay online or respond instantly
- **Extensible**: Easy to layer in automation, UI approval, or task tracking

---

## 🔮 Future Extensions

- 🎤 Real-time “Scribe is listening…” assistant
- 🧾 PDF or Notion export of transcripts
- 📅 Native calendar integration
- 🧩 Plugin framework for new task types
- 🗂️ Knowledge base / Q&A integration (“What did we decide last Tuesday?”)

---
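
As a rough illustration of the knowledge base / Q&A idea, the stored transcript JSON files could be scanned for relevant utterances before handing them to an LLM. The sketch below is a plain keyword scan, not the design proposed here: `search_transcripts` is a hypothetical helper, and it assumes transcripts have been loaded into memory in the same shape as the example transcript JSON (`meeting_id`, `transcript`, `speaker`, `text`). A real Q&A feature would more likely use embeddings or an LLM over the retrieved passages.

```python
def search_transcripts(meetings, keyword):
    """Return (meeting_id, speaker, text) for every utterance containing keyword.

    Matching is a case-insensitive substring test; `meetings` is a list of
    dicts shaped like the example transcript JSON.
    """
    needle = keyword.lower()
    hits = []
    for meeting in meetings:
        for turn in meeting["transcript"]:
            if needle in turn["text"].lower():
                hits.append((meeting["meeting_id"], turn["speaker"], turn["text"]))
    return hits


# Sample data copied from the example transcript JSON shape.
meetings = [
    {
        "meeting_id": "abc123",
        "duration": "22m",
        "participants": ["alice", "bob", "carol"],
        "transcript": [
            {"speaker": "alice", "text": "We should create a design-review channel."},
            {"speaker": "bob", "text": "I'll schedule the sync."},
        ],
    }
]

print(search_transcripts(meetings, "channel"))
# [('abc123', 'alice', 'We should create a design-review channel.')]
```
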