# H20429 Wei Lun Hsiao (蕭葦倫)

## Applicant Information

- **Name:** Wei Lun Hsiao
- **Affiliation:** National Taiwan University, Department of Computer Science and Information Engineering
- **Email:** b13902049@csie.ntu.edu.tw
- **GitHub:** [AlenHsiaoWeiLun](https://github.com/AlenHsiaoWeiLun/gsoc_music_agent)
- **Location:** Taipei, Taiwan (GMT+8)

---

## 1. Motivation & Vision

### Bridging Emotion and Code

I’ve always believed that technology should not just be functional—it should *feel*. As a computer science student at National Taiwan University, my journey began with algorithms and abstractions. But over time, I realized that what drives me isn’t just solving problems—it’s building systems that understand people.

Music, for me, has always been more than background noise. It’s an emotional language—one that heals, energizes, and connects. That’s why this project resonates deeply. A music agent that listens, understands, and adapts to your emotional state? That’s not just technically exciting—it’s human.

### Long-Term Vision

This assistant isn’t just about music. It’s a prototype for a new kind of AI—one that’s emotionally aware, conversationally fluent, and designed for humans, not just power users.

In the long run, I plan to evolve this project into both a **research publication** and a **product platform**:

- As research, I aim to study *trajectory-based music recommendation*, *emotion-conditioned semantic reranking*, and *conversational feedback loops*, and turn these insights into a paper on human-centered AI.
- As a product, I envision integrating **real-time biofeedback** via wearable sensors (e.g., heart rate, breathing, GSR) to inform mood inference and dynamically adjust playlists.

Imagine calming playlists that **respond to your anxiety levels**, or energizing transitions during workouts—emotion-driven, real-time, and deeply personal.

This is not just code. It’s **an interface between emotion and interaction**—a living agent that learns how to support you, musically and emotionally.

---

## 2. Technical Feasibility & System Design

### Proof-of-Concept Demonstrated

To validate the technical feasibility and reduce implementation risk, I have already developed several working proof-of-concept (PoC) components:

- ✅ **LLM-powered CLI dialogue agent** (Module 5): Uses the locally hosted Mistral-7B-Instruct model to interpret user moods and generate empathetic responses.
- ✅ **Spotify OAuth2 Token Manager** (Module 0): Fully functional token flow, including local `token.json` caching and automatic token refresh.
- ✅ **Fallback playlist engine** (Module 3): Returns mood-based track recommendations when Spotify's `/v1/recommendations` fails.
- ✅ **Feedback-aware reranker** (Module 4): Uses liked/disliked track history to re-score and reorder track recommendations dynamically.
- ✅ **Trajectory-based playlist generator** (Module 2): Builds multi-phase playlists that reflect emotional journeys (e.g., tired → energized), supporting dynamic filtering and audio feature interpolation.

These modules were tested on a local Debian workstation with real-time Spotify API interactions, offline LLM inference, and persistent user preference storage.

### System Architecture

SymMuse is composed of several key modules:

1. **Conversational Frontend Interface**: Web-based UI (React + Tailwind) styled as a chatbot for playlist interaction.
2. **NLP Emotion & Intent Understanding**: Lightweight sentiment and intent detection model using DistilBERT or an LLM.
3. **Personalized Playlist Generator**: Emotion-driven generation using Spotify’s audio features and feedback refinement.
4. **Spotify API Integration Layer**: Secure OAuth2 token handling and playlist management.
5. **Privacy-Friendly Exploratory Mode**: Offline playlist recommendations using public datasets.
6. **Feedback & Refinement System**: Enables interactive playlist edits (e.g., “more upbeat”).
7. **Backend API Service**: RESTful interface for frontend, AI inference, and Spotify sync.
8. **Documentation & Deployment Tools**: For future contributors and deployment readiness.
9. **Proof-of-Concept Demo**: Video walkthrough to showcase system capabilities.

---

## 3. Innovation and Originality

SymMuse introduces a novel fusion of real-time speech-based emotion detection, semantic parsing, and intelligent music curation. Unlike existing platforms that rely solely on user history or static preferences, SymMuse responds to transient emotional cues and adapts in the moment.

Key innovations:

- **Emotion trajectory modeling** based on speech and conversation
- **Explainable recommendations** via LLM-generated feedback
- **Edge deployability** with privacy-preserving architecture

This is not a repackaged toolchain—it is a ground-up system tailored to enhance digital empathy.

---

## 4. Expected Impact

### Commercial Potential

- Add-on module for streaming platforms (Spotify, KKBOX)
- Integration with voice assistants or wearables (e.g., Apple Watch)
- Emotion-aware BGM for automotive, meditation, or mental health apps

### Social Value

- Promotes emotional self-awareness and resilience
- Provides gentle, ambient mental health support
- Sparks interdisciplinary innovation between music, psychology, and AI

SymMuse reflects a future where technology doesn’t just compute, but listens and feels.

---

## 5. Development Plan

### Current Progress

- SER model trained and deployed locally
- NLP module integrated with Mistral-7B
- CLI-based prototype generates real Spotify playlists

### Next Steps

- Front-end voice interface
- Heart-rate-based emotion fusion
- Field testing with users

### Prototype UI Overview

The proposed conversational frontend interface follows a chatbot-style layout and emphasizes emotional awareness, real-time refinement, and explainable suggestions.

![Prototype UI mockup](https://hackmd.io/_uploads/SyCXByKC1g.png)

---

## Deliverables

1. **Conversational Frontend Interface**
   - A responsive web-based UI built with React + Tailwind, styled like a chatbot (e.g., ChatGPT).
   - Enables users to enter natural language queries (e.g., “play something chill”), receive music recommendations, and interact with playlists.
   - Includes quick buttons for modifying mood, energy, genre, and toggling privacy mode.
2. **NLP Emotion & Intent Understanding Module**
   - Lightweight NLP component capable of detecting user mood, intent (e.g., “add jazz”), and context (e.g., “for studying”).
   - Powered by DistilBERT or an external LLM via API for prompt-based classification.
   - Supports multilingual inputs (extendable).
3. **Personalized Playlist Generator**
   - Generates mood-aligned playlists by mapping emotional cues to Spotify audio features (valence, energy, tempo, etc.).
   - Uses Spotify’s recommendation engine, the user’s listening history, and genre taxonomy.
   - Supports multi-stage playlist arcs (e.g., emotional trajectory modeling).
   - Re-ranks tracks based on implicit or explicit feedback (e.g., skipped or liked songs).
   - Adjustable with follow-up inputs like “make it more energetic”.
4. **Spotify API Integration Layer**
   - Handles OAuth2 authentication and secure token management.
   - Provides access to user metadata (liked tracks, playlists, top artists).
   - Enables playlist creation, modification, and real-time updates.
5. **Exploratory (Privacy) Mode**
   - Allows users to use the recommender without logging into Spotify.
   - Uses open datasets (e.g., Moodify) and Spotify’s generic genre/mood seed data.
   - Ensures full functionality without accessing personal data.
6. **Real-time Feedback & Refinement System**
   - Supports interactive commands like “more upbeat”, “add lofi”, “remove vocals”.
   - Refines the current playlist using updated audio feature parameters.
   - Prevents full regeneration unless explicitly requested by the user.
7. **Modular Backend API Service**
   - RESTful API endpoints to handle frontend queries, AI inference, playlist logic, and Spotify communication (see the endpoint sketch after this list).
   - Designed to be stateless and scalable.
   - Includes internal logging and user session handling.
8. **Developer Documentation & Deployment Guide**
   - Setup instructions, module overviews, and API usage examples.
   - Includes annotated code, an architecture diagram, environment setup, and example queries.
   - Helps contributors extend or deploy the system easily.
9. **Proof-of-Concept Demo & Video Walkthrough**
   - A short recorded demo showing real-time interaction with the agent, playlist generation, and refinement flow.
   - Serves as a milestone check and onboarding aid for community contributors.

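To make Deliverable 7 concrete, here is a minimal sketch of one endpoint shape, assuming FastAPI (the framework Module 1 already uses); `detect_mood` and `generate_playlist` are hypothetical stand-ins for Modules 1 and 2, not the final API contract.

```python
# backend_api_sketch.py: minimal sketch of the Modular Backend API Service.
# FastAPI is assumed because Module 1 already uses it; detect_mood and
# generate_playlist are hypothetical stand-ins for Modules 1 and 2.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    text: str        # free-form user message, e.g., "play something chill"
    session_id: str  # opaque ID so the service itself can stay stateless

class PlaylistResponse(BaseModel):
    mood: str
    playlist_url: str

def detect_mood(text: str) -> str:
    # Stub: the real service would call Module 1's /sentiment endpoint here.
    return "chill" if "chill" in text.lower() else "neutral"

def generate_playlist(mood: str, session_id: str) -> str:
    # Stub: the real service would run Module 2 and return the playlist URL.
    return f"https://open.spotify.com/playlist/placeholder-{mood}"

@app.post("/chat", response_model=PlaylistResponse)
def chat(req: ChatRequest) -> PlaylistResponse:
    mood = detect_mood(req.text)                   # Module 1: emotion/intent
    url = generate_playlist(mood, req.session_id)  # Module 2: playlist logic
    return PlaylistResponse(mood=mood, playlist_url=url)
```

Run with `uvicorn backend_api_sketch:app` and POST to `/chat`; keeping the handler stateless (session state lives behind `session_id`) is what makes the service horizontally scalable.
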
---

## Implementation Modules

### System Architecture

![System architecture diagram](https://hackmd.io/_uploads/SyJOrkKAkl.jpg)

---

### Module 0: Spotify OAuth2 Token Manager

#### Goal

To securely and automatically manage access to the Spotify Web API by implementing a robust OAuth2 token handling system. This module enables the rest of the application (playlist generation, refinement, metadata retrieval) to interact with Spotify without manual token refresh or reauthentication.

#### Problem

Spotify's access tokens expire every 3600 seconds (1 hour). If not refreshed in time, API calls will fail with authorization errors. Manually updating access tokens is error-prone, disrupts workflow, and breaks automated systems. Moreover, hardcoding tokens poses security and privacy risks.

#### Solution & Implementation

This module uses the full Spotify OAuth2 authorization code flow with refresh tokens and automatic expiry checking. It:

- Loads client credentials from a `.env` file
- Launches a local Flask server to receive the OAuth callback
- Exchanges the authorization code for access & refresh tokens
- Saves token information in a local `token.json`
- Automatically refreshes the token when expired

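A minimal sketch of the expiry check and refresh step follows. The `token.json` field names (`access_token`, `refresh_token`, `expires_at`) and the environment variable names are illustrative assumptions about the stored layout; the token endpoint and `grant_type=refresh_token` parameters follow Spotify's standard OAuth2 flow.

```python
# token_refresh_sketch.py: illustrative expiry check + refresh, not the
# project's token_manager.py itself. token.json field names and the
# SPOTIFY_* environment variable names are assumptions.
import json
import os
import time

import requests

TOKEN_URL = "https://accounts.spotify.com/api/token"  # Spotify's standard token endpoint

def get_valid_access_token(path: str = "token.json") -> str:
    """Return a usable access token, refreshing via the stored refresh
    token when the cached one is within 60 s of expiry."""
    with open(path) as f:
        tok = json.load(f)
    if time.time() < tok["expires_at"] - 60:  # still fresh: reuse cached token
        return tok["access_token"]
    res = requests.post(TOKEN_URL, data={
        "grant_type": "refresh_token",
        "refresh_token": tok["refresh_token"],
        # Client credentials come from the .env file described above.
        "client_id": os.environ["SPOTIFY_CLIENT_ID"],
        "client_secret": os.environ["SPOTIFY_CLIENT_SECRET"],
    })
    res.raise_for_status()
    fresh = res.json()
    tok["access_token"] = fresh["access_token"]
    tok["expires_at"] = time.time() + fresh.get("expires_in", 3600)
    with open(path, "w") as f:  # persist so later runs reuse the new token
        json.dump(tok, f, indent=2)
    return tok["access_token"]
```

Refreshing 60 seconds early avoids racing in-flight API requests against the one-hour expiry described above.
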
#### File Structure

```
~/gsoc_spotify_agent/
├── spotify_auth/
│   ├── token_manager.py   # Token logic (get, refresh, save)
│   ├── secrets.env        # Contains client ID, secret, redirect URI
│   └── token.json         # Auto-generated on first login
└── token_test.py          # Usage example and verification
```

#### Code Snippet

```python
from spotify_auth.token_manager import SpotifyAuth

auth = SpotifyAuth()
token = auth.get_access_token()
print("Access Token:", token[:50], "...")
```

#### Authorization Flow

1. On first run, opens a browser for the user to log in via Spotify
2. Redirects to `http://localhost:8888/callback`
3. The Flask app intercepts the code and completes the handshake
4. Writes the access + refresh token to disk
5. Subsequent runs auto-refresh expired tokens

#### Example Output

```
Access Token: BQCJv9kiaDKPTtHBMuY0hnn6Tzu5M8jU2UVBUnMVlEsduRvONJ ...
```

#### Extensibility Plan

- Token expiration fallback for API calls (auto retry)
- Multi-user support by storing tokens under user ID
- Token encryption for safer storage
- GUI login flow for desktop use

#### Summary

The token manager is the backbone of all authenticated Spotify API operations in this project. It eliminates friction, ensures API reliability, and aligns with OAuth2 best practices. Every module that talks to Spotify depends on this layer, and it has been built with extensibility, security, and automation in mind.

### Module 1: Mood & Sentiment Detection Engine

#### Goal

To interpret user input and infer both affective sentiment (e.g., positive, neutral, negative) and deeper emotional states (e.g., joy, sadness, anxiety, fatigue) through a cognitively informed NLP pipeline. This module serves as the emotion comprehension layer of the conversational music agent—mirroring how humans infer others' emotions based on language, context, and tone. It provides the foundation for empathy-aligned playlist recommendations.

Inspired by appraisal theory in cognitive science, the goal is not only to classify emotion, but also to interpret implicit expressions of mood, even when users do not explicitly name their feelings.

#### Problem

Emotion detection from text is inherently ambiguous and context-dependent. Users often express emotional states indirectly or with mixed emotional valence:

- "I just want to lie down" may imply tiredness or sadness.
- "That movie was exhausting, but incredible" conveys emotional contradiction.

A purely sentiment-based classifier is insufficient. We need a low-latency model that integrates both sentiment and mood cues, understands implicit emotion, and is robust against hallucinated labels, especially in ambiguous or conversational settings.

#### Solution & Implementation

We begin with a lightweight transformer-based sentiment model, `cardiffnlp/twitter-roberta-base-sentiment`, which is well-suited for casual, emotionally rich language.

```python
# sentiment_api.py
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from scipy.special import softmax
import torch

app = FastAPI()

MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

class TextRequest(BaseModel):
    text: str

@app.post("/sentiment")
def get_sentiment(req: TextRequest):
    encoded = tokenizer(req.text, return_tensors="pt")
    logits = model(**encoded).logits
    probs = softmax(logits.detach().numpy()[0])
    label = ["Negative", "Neutral", "Positive"][probs.argmax()]
    confidence = float(probs.max())
    return {
        "text": req.text,
        "sentiment": label,
        "confidence": round(confidence, 3)
    }
```

#### Example Output (via Swagger UI)

```json
POST /sentiment
{
  "text": "I'm so happy I could cry!"
}

Response:
{
  "text": "I'm so happy I could cry!",
  "sentiment": "Positive",
  "confidence": 0.99
}
```

#### Extensibility Plan

To better reflect real-world emotion, this module is designed with the following upgrades in mind:

- Add fine-grained emotion classification (`joy`, `anger`, `fear`, etc.) using the `go_emotions` dataset or `m3e` embeddings + KNN clustering (a sketch follows this list)
- Integrate voice-based emotional prosody analysis for future multimodal input
- Incorporate multi-turn context tracking to capture temporal emotion shifts, enabling richer cognitive modeling (e.g., frustration buildup or mood transitions)
- Explore emotion commonsense reasoning models (e.g., `COMET` or `EmoCause`) for implicit cause-effect emotion detection

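As a sketch of the first upgrade, the snippet below tags fine-grained emotions with an off-the-shelf `go_emotions` classifier. The `SamLowe/roberta-base-go_emotions` checkpoint is an assumption; any model fine-tuned on that label set, or the `m3e` embeddings + KNN route, could be swapped in.

```python
# emotion_sketch.py: fine-grained emotion tagging for the Extensibility Plan.
# The SamLowe/roberta-base-go_emotions checkpoint is an assumption; any model
# fine-tuned on the go_emotions label set could replace it.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=3,  # keep the three most probable of the 28 go_emotions labels
)

def fine_grained_emotions(text: str):
    scores = classifier([text])[0]  # list of {"label": ..., "score": ...}
    return [(s["label"], round(s["score"], 3)) for s in scores]

if __name__ == "__main__":
    print(fine_grained_emotions("I just want to lie down."))
    # Prints the top-3 emotion labels with scores (e.g., sadness/neutral);
    # exact labels and scores depend on the checkpoint.
```
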
#### Summary

This module forms the first layer of affective intelligence for the Spotify music agent. Beyond mere sentiment tagging, it introduces a cognitively grounded approach to emotion detection—modeling how humans interpret affect through language. It is API-ready, real-time, and designed for modular integration with playlist generation and user feedback loops.

By anchoring this component in both NLP and cognitive science, we ensure that future playlist recommendations are not just technically relevant, but emotionally resonant and contextually appropriate.

---

### Module 2: Playlist Generator with Spotify API (Advanced Strategy)

#### Goal

To generate highly personalized and emotionally aligned Spotify playlists using inputs from the NLP engine (Module 1). This module constructs not just a list of songs, but a curated music journey tailored to the user's mood, energy, and contextual needs.

#### Problem

Users typically require 2–3 hours of music in one session (~50+ tracks), but Spotify’s recommendation API returns a limited number of suggestions with diminishing quality. A single batch of recommendations rarely suffices to ensure emotional accuracy, musical diversity, and satisfaction.

Additionally, static or shallow queries (e.g., based solely on valence and energy) cannot fully capture the complexity of emotional states, context (like weather, activity, or time), or personal taste.

#### Solution & Implementation

We treat Spotify’s API not as a final recommender, but as a music pool. Our approach consists of multi-stage sampling, mood trajectory modeling, intelligent reranking, and robust playlist assembly.

##### Key Components

1. **Mood Trajectory Modeling**
   - Use user emotion input to generate 2–4 mood stages (e.g., "tired but want to feel energetic" → low valence/energy → mid → high); a stage-interpolation sketch follows this list.
   - Each stage has its own valence/energy targets and contributes ~15–20 tracks.
2. **Multi-Round Genre Sampling**
   - For each stage, query Spotify `/v1/recommendations` multiple times with varied `seed_genres`, `seed_artists`, and valence/energy.
   - Introduce ~20% "surprise genres" to promote diversity.
   - Aim to collect a candidate pool of 100–150 songs.
3. **User Profile Matching**
   - Retrieve the user's top artists/tracks using `GET /v1/me/top/{type}`.
   - Score recommendations based on acoustic similarity (e.g., danceability, tempo, mood tags) or embedding proximity.
4. **Context-Aware Filtering & Reranking**
   - Filter tracks using context keywords (e.g., remove vocals for "deep work").
   - Prioritize songs that align with the current activity or emotional goal.
   - Deduplicate artists, avoid recent repeats, and ensure smooth transitions.
5. **Playlist Assembly**
   - Create a new playlist via `POST /v1/users/{user_id}/playlists`.
   - Add the top 50–70 filtered and scored tracks using `POST /v1/playlists/{playlist_id}/tracks`.

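A minimal sketch of Key Component 1: the helper below linearly interpolates valence/energy between a start and a target mood point. Linear spacing and a fixed stage count are simplifying assumptions; any easing curve could replace them.

```python
# trajectory_sketch.py: mood trajectory modeling sketch (Key Component 1).
# Linear interpolation between two mood points is an assumption.
def mood_trajectory(start: dict, target: dict, stages: int = 3) -> list:
    """Interpolate valence/energy from `start` to `target` over `stages` steps."""
    steps = []
    for i in range(stages):
        t = i / (stages - 1) if stages > 1 else 1.0
        steps.append({
            "valence": round(start["valence"] + t * (target["valence"] - start["valence"]), 2),
            "energy":  round(start["energy"]  + t * (target["energy"]  - start["energy"]),  2),
        })
    return steps

if __name__ == "__main__":
    # "tired but want to feel energetic": low valence/energy -> high
    print(mood_trajectory({"valence": 0.3, "energy": 0.2},
                          {"valence": 0.8, "energy": 0.7}, stages=3))
    # [{'valence': 0.3, 'energy': 0.2}, {'valence': 0.55, 'energy': 0.45},
    #  {'valence': 0.8, 'energy': 0.7}]
```
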
#### API Endpoint Availability Verification

| Endpoint | Purpose | Verified | Notes |
|----------|---------|----------|-------|
| `GET /v1/recommendations` | Base music recommendation | ✅ Yes | Used multiple times with different seeds per stage. |
| `POST /v1/users/{user_id}/playlists` | Create playlist | ✅ Yes | Named dynamically (e.g., "Mood Journey – Apr 5"). |
| `POST /v1/playlists/{playlist_id}/tracks` | Add tracks | ✅ Yes | Supports batch updates. |
| `GET /v1/audio-features/{id}` | Track features | ✅ Yes | Used for similarity scoring and filtering. |
| `GET /v1/recommendations/available-genre-seeds` | Validate genres | ✅ Yes | Filters invalid genre tags. |
| `GET /v1/me/top/{type}` | User top artists/tracks | ✅ Yes | Powers personalization layer. |

#### Example Usage Snippet

```python
uris = []
for mood in mood_trajectory:
    for genre in seed_genres + surprise_genres:
        batch = get_recommendations(seed_genres=[genre], **mood)
        uris.extend(batch)

scored = rerank_with_profile(uris, user_profile, context)
playlist = scored[:60]
create_playlist(name="Mood Journey", tracks=playlist)
```

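`rerank_with_profile` in the snippet above is left abstract; below is a minimal sketch under the assumption that candidates have already been joined with their audio features and that the profile stores preferred feature targets. Both data shapes are illustrative, not the final model.

```python
# rerank_sketch.py: illustrative rerank_with_profile; the track and profile
# dictionaries are assumptions, not the final data model.
from typing import Dict, List

def rerank_with_profile(tracks: List[Dict], profile: Dict, context: List[str]) -> List[str]:
    """Score candidates by closeness to the user's preferred audio features,
    with a small bonus for context matches, and return URIs best-first."""
    def score(t: Dict) -> float:
        # Distance to the profile's preferred valence/energy (lower is better).
        dist = abs(t["valence"] - profile["valence"]) + abs(t["energy"] - profile["energy"])
        bonus = 0.2 if t.get("genre") in profile.get("top_genres", []) else 0.0
        if "deep work" in context and t.get("instrumental", False):
            bonus += 0.2  # context filter: favor instrumentals while working
        return -dist + bonus
    return [t["uri"] for t in sorted(tracks, key=score, reverse=True)]

if __name__ == "__main__":
    candidates = [
        {"uri": "spotify:track:a", "valence": 0.8, "energy": 0.7, "genre": "indie"},
        {"uri": "spotify:track:b", "valence": 0.2, "energy": 0.3, "genre": "jazz",
         "instrumental": True},
    ]
    profile = {"valence": 0.3, "energy": 0.3, "top_genres": ["jazz"]}
    print(rerank_with_profile(candidates, profile, ["deep work"]))
    # ['spotify:track:b', 'spotify:track:a']
```
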
#### Full PoC Script (Validated)

The following script demonstrates a full working implementation of the advanced playlist strategy. It generates a 60-track playlist using mood trajectory modeling, multi-round genre sampling, and dynamic playlist creation via the Spotify API.

```python
from spotify_auth.token_manager import SpotifyAuth
import requests
import random

ACCESS_TOKEN = SpotifyAuth().get_access_token()
USER_ID = "your_user_id"
HEADERS = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "Content-Type": "application/json"
}

def get_recommendations(seed_genres, valence, energy, limit=10):
    url = "https://api.spotify.com/v1/recommendations"
    params = {
        "seed_genres": ",".join(seed_genres),
        "target_valence": valence,
        "target_energy": energy,
        "limit": limit
    }
    res = requests.get(url, headers=HEADERS, params=params)
    return [track["uri"] for track in res.json()["tracks"]]

def create_playlist(name, track_uris):
    create_url = f"https://api.spotify.com/v1/users/{USER_ID}/playlists"
    payload = {"name": name, "description": "Auto-generated by MeowWave", "public": False}
    res = requests.post(create_url, headers=HEADERS, json=payload)
    playlist_id = res.json()["id"]

    # Batch add (Spotify accepts at most 100 URIs per request)
    add_url = f"https://api.spotify.com/v1/playlists/{playlist_id}/tracks"
    for i in range(0, len(track_uris), 100):
        chunk = {"uris": track_uris[i:i+100]}
        requests.post(add_url, headers=HEADERS, json=chunk)

    return f"https://open.spotify.com/playlist/{playlist_id}"

# ----- Full Example Flow -----
mood_trajectory = [
    {"valence": 0.3, "energy": 0.2},
    {"valence": 0.5, "energy": 0.5},
    {"valence": 0.8, "energy": 0.7}
]
seed_genres = ["ambient", "acoustic"]
surprise_genres = random.sample(["indie", "folk", "jazz"], 1)

all_uris = []
for mood in mood_trajectory:
    for genre in seed_genres + surprise_genres:
        tracks = get_recommendations([genre], mood["valence"], mood["energy"], limit=10)
        all_uris.extend(tracks)

playlist_url = create_playlist("Mood Journey – Apr 5", all_uris[:60])
print("✅ Playlist created:", playlist_url)
```

#### Inputs

- `seed_genres`, `seed_artists`, `seed_tracks`: Inferred from the NLP module and user history.
- `valence`, `energy`: Per stage of the mood trajectory.
- `context_keywords`: Affect ranking and filtering logic.

#### Output

- A Spotify playlist (new or updated) with ~50 emotionally coherent and contextually appropriate songs.

#### Extensibility Plan

- Use embeddings or vector similarity for deeper personalization.
- Incorporate Module 4 feedback for active playlist evolution.
- Allow real-time regeneration via the conversational agent (Module 5).
- Explore integration with local LLMs.

#### Summary

Unlike naive playlist generators, this module models musical journeys, collects from diverse sources, filters and scores candidates, and delivers long-form playlists that reflect a user’s emotional context. It combines Spotify’s engine with custom logic and future extensibility hooks, forming the core of a deeply personalized music assistant.

Future work will explore replacing Spotify’s API with a fully self-trained recommender using custom embeddings and collaborative filtering based on anonymized user taste graphs.

---

### Module 3: Smart Recommendation Fallback Engine

#### Goal

To ensure the music agent can always provide recommendations, even when Spotify's official `/v1/recommendations` API fails due to insufficient listening history, region lock, or cold-start limitations. This module introduces a fallback mechanism based on predefined mood-tagged tracks.

#### Problem

Spotify’s recommendation API can unpredictably return 404 errors when there is insufficient listening data, or for newly created or low-activity accounts. This failure breaks the user experience and interrupts the emotion-to-music pipeline.

#### Solution & Proof of Concept (PoC)

We define a hard-coded track pool categorized by mood (e.g., "happy", "sad", "chill"). If the API call to Spotify fails, we randomly sample from the corresponding category and populate a playlist via Spotify’s playlist management API. This guarantees uninterrupted functionality and user satisfaction.

#### Fallback Track Pool

```python
MOCK_TRACKS = {
    "happy": [
        "5W3cjX2J3tjhG8zb6u0qHn",  # Ed Sheeran - Shape of You
        "3AhXZa8sUQht0UEdBJgpGc",  # Pharrell Williams - Happy
        "7y7w4M3zP28X4PjB0KukLx",  # Justin Timberlake - Can't Stop The Feeling!
    ],
    "sad": [
        "2dLLR6qlu5UJ5gk0dKz0h3",  # Adele - Someone Like You
        "4JpKVNYnVcJ8tuMKjAj50A",  # Sam Smith - Too Good At Goodbyes
        "1rqqCSm0Qe4I9rUvWncaom",  # Lewis Capaldi - Someone You Loved
    ],
    "chill": [
        "3Zwu2K0Qa5sT6teCCHPShP",  # Billie Eilish - ocean eyes
        "2eBnhLqmuM0r8C3O1aYJEa",  # Khalid - Location
        "0rCYPc082fS0P8U7EfUwCk",  # Mac Miller - Good News
    ]
}
```

#### PoC Code Snippet

```python
from spotify_auth.token_manager import SpotifyAuth
import requests
import random

# Return fallback tracks based on mood
def recommend_by_mood(mood="chill", n=3):
    track_pool = MOCK_TRACKS.get(mood, MOCK_TRACKS["chill"])
    return random.sample(track_pool, k=min(n, len(track_pool)))

# Add selected tracks to a Spotify playlist
def add_tracks_to_playlist(track_ids, playlist_id):
    token = SpotifyAuth().get_access_token()
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    url = f"https://api.spotify.com/v1/playlists/{playlist_id}/tracks"
    payload = {"uris": [f"spotify:track:{tid}" for tid in track_ids]}
    res = requests.post(url, headers=headers, json=payload)
    return res.status_code == 201

# PoC test run
if __name__ == "__main__":
    mood = "happy"
    playlist_id = "1cPzpV6Mt5UvBplFTad5rh"
    track_ids = recommend_by_mood(mood)
    print("🎧 Tracks to add:", track_ids)
    if add_tracks_to_playlist(track_ids, playlist_id):
        print("✅ Added fallback recommendations to playlist")
    else:
        print("❌ Failed to add tracks")
```

#### Extensibility Plan

- Move fallback tracks to `tracks.json` or `tracks.csv` for maintainability
- Integrate fuzzy matching for mood inputs (e.g., "relaxed" → "chill"); see the sketch after this list
- Add metadata caching for song titles and cover art
- Connect the fallback system to the sentiment detection module
- Support automatic playlist naming, e.g., “Your Happy Mix – March 29”

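The fuzzy matching item can be sketched with the standard library alone; the synonym table below is an illustrative assumption that would live alongside the track pool in `tracks.json`.

```python
# mood_match_sketch.py: fuzzy matching for mood inputs (Extensibility Plan).
# The synonym table is an assumption; in practice it would be stored with
# the fallback track pool in tracks.json.
import difflib

CANONICAL_MOODS = ["happy", "sad", "chill"]
SYNONYMS = {"relaxed": "chill", "calm": "chill", "down": "sad", "joyful": "happy"}

def resolve_mood(user_mood: str, default: str = "chill") -> str:
    mood = user_mood.strip().lower()
    if mood in CANONICAL_MOODS:
        return mood
    if mood in SYNONYMS:
        return SYNONYMS[mood]
    # Fall back to the closest spelling match, e.g., "chil" -> "chill".
    close = difflib.get_close_matches(mood, CANONICAL_MOODS, n=1, cutoff=0.6)
    return close[0] if close else default

if __name__ == "__main__":
    for raw in ["Relaxed", "chil", "melancholy"]:
        print(raw, "->", resolve_mood(raw))
    # Relaxed -> chill, chil -> chill, melancholy -> chill (default)
```
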
#### Summary

This fallback module guarantees playlist generation regardless of Spotify’s internal recommendation availability. It builds trust with users and provides a foundation for more advanced hybrid recommendation systems that mix AI, NLP, and curated datasets.

---

### Module 4: Playlist Feedback & Personalization Engine

#### Goal

To close the loop between the AI music agent and the user by enabling dynamic feedback, preference learning, and real-time playlist personalization. This module empowers users to like/dislike specific tracks or request changes via natural language, thereby improving future recommendations and enabling adaptive listening experiences.

#### Problem

Spotify's API does not provide deep reinforcement learning mechanisms out of the box. While initial recommendations may be useful, users often want to guide the music selection based on subjective experience. Without a feedback loop, the system cannot improve or adapt to user preferences over time.

Additionally, not all users are comfortable with full personalization; thus, the system must support opt-in learning and an alternative "exploration mode."

#### Solution & Implementation

The system introduces a feedback mechanism that captures user preferences (e.g., "like" or "dislike") either through UI interactions or conversational feedback. These preferences are stored locally as JSON-based user profiles.

A feedback-aware playlist generator uses these preferences to:

- Prioritize previously liked songs or similar tracks
- Avoid disliked or skipped songs
- Adjust valence/energy/genre weights based on user preference clusters

If a user opts out of personalization, the system will use an exploratory mode based on mood, genre, and random fallback selections.

#### Proof of Concept (PoC)

We simulate user feedback on track IDs and update a local preference store (`preferences.json`):

```json
{
  "likes": ["5W3cjX2J3tjhG8zb6u0qHn", "3Zwu2K0Qa5sT6teCCHPShP"],
  "dislikes": ["2dLLR6qlu5UJ5gk0dKz0h3"]
}
```

This preference store can be loaded by the playlist generator to reweight or re-rank recommendations in future sessions.

#### PoC Python Script

```python
# module4_feedback_test.py
from typing import Dict, List
import json
import os

PROFILE_PATH = "user_data/preferences.json"

FEEDBACK_EVENTS = [
    {"track_id": "5W3cjX2J3tjhG8zb6u0qHn", "liked": True},
    {"track_id": "2dLLR6qlu5UJ5gk0dKz0h3", "liked": False},
    {"track_id": "3Zwu2K0Qa5sT6teCCHPShP", "liked": True}
]

os.makedirs("user_data", exist_ok=True)
if not os.path.exists(PROFILE_PATH):
    with open(PROFILE_PATH, "w") as f:
        json.dump({"likes": [], "dislikes": []}, f)

def update_preferences(events: List[Dict]):
    with open(PROFILE_PATH, "r") as f:
        prefs = json.load(f)
    for event in events:
        tid = event["track_id"]
        if event["liked"] and tid not in prefs["likes"]:
            prefs["likes"].append(tid)
            if tid in prefs["dislikes"]:
                prefs["dislikes"].remove(tid)
        elif not event["liked"] and tid not in prefs["dislikes"]:
            prefs["dislikes"].append(tid)
            if tid in prefs["likes"]:
                prefs["likes"].remove(tid)
    with open(PROFILE_PATH, "w") as f:
        json.dump(prefs, f, indent=2)

if __name__ == "__main__":
    update_preferences(FEEDBACK_EVENTS)
    print("Preferences updated. Current profile:")
    with open(PROFILE_PATH, "r") as f:
        print(json.dumps(json.load(f), indent=2))
```

#### Feedback-Based Reranking

We add a reranking step that prioritizes recommended tracks based on the feedback profile. Each track is scored based on its valence plus a weight determined by whether it has been liked or disliked in the past.

```python
# feedback_reranker.py
import json
from typing import List, Dict

PREF_PATH = "user_data/preferences.json"

RECOMMENDED_TRACKS = [
    {"id": "5W3cjX2J3tjhG8zb6u0qHn", "valence": 0.9},
    {"id": "2dLLR6qlu5UJ5gk0dKz0h3", "valence": 0.3},
    {"id": "1rqqCSm0Qe4I9rUvWncaom", "valence": 0.5},
    {"id": "3Zwu2K0Qa5sT6teCCHPShP", "valence": 0.6},
    {"id": "7y7w4M3zP28X4PjB0KukLx", "valence": 0.7}
]

LIKE_BOOST = 1.0
DISLIKE_PENALTY = -1.0

def rerank_with_feedback(tracks: List[Dict], preferences: Dict) -> List[Dict]:
    for t in tracks:
        tid = t["id"]
        score = t["valence"]
        if tid in preferences["likes"]:
            score += LIKE_BOOST
        elif tid in preferences["dislikes"]:
            score += DISLIKE_PENALTY
        t["score"] = round(score, 4)
    return sorted(tracks, key=lambda x: x["score"], reverse=True)

if __name__ == "__main__":
    with open(PREF_PATH, "r") as f:
        prefs = json.load(f)
    reranked = rerank_with_feedback(RECOMMENDED_TRACKS, prefs)
    print("Reranked Recommendations (Top First):")
    for t in reranked:
        print(f"{t['id']} (score={t['score']})")
```

#### Extensibility Plan

- Add conversation-based feedback interpretation (e.g., "this one is too slow" → adjust tempo); a sketch follows this list
- Sync preferences with the cloud (for persistence across devices)
- Use clustering/embedding-based similarity search to expand liked songs
- Add support for an opt-out toggle in the UI or config
- Integrate scoring into backend API responses

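A minimal sketch of conversation-based feedback interpretation: a keyword-to-delta table over Spotify's `target_*` recommendation parameters. The phrase list, baselines, and step sizes are assumptions; the LLM intent parser from Module 5 could replace the keyword matching.

```python
# feedback_rules_sketch.py: conversation-based feedback interpretation sketch.
# Phrase table, baselines, and step sizes are illustrative assumptions.
FEEDBACK_RULES = {
    "too slow":      {"target_tempo": +20},  # BPM nudge
    "too fast":      {"target_tempo": -20},
    "more upbeat":   {"target_valence": +0.2, "target_energy": +0.2},
    "too intense":   {"target_energy": -0.2},
    "remove vocals": {"target_instrumentalness": +0.5},
}

BASELINES = {"target_tempo": 120, "target_valence": 0.5,
             "target_energy": 0.5, "target_instrumentalness": 0.0}

def interpret_feedback(message: str, params: dict) -> dict:
    """Apply every matching rule to the current recommendation parameters.
    (Clamping to Spotify's valid ranges is omitted for brevity.)"""
    updated = dict(params)
    for phrase, deltas in FEEDBACK_RULES.items():
        if phrase in message.lower():
            for feature, delta in deltas.items():
                updated[feature] = round(updated.get(feature, BASELINES[feature]) + delta, 2)
    return updated

if __name__ == "__main__":
    current = {"target_valence": 0.4, "target_energy": 0.3}
    print(interpret_feedback("This one is too slow, make it more upbeat", current))
    # {'target_valence': 0.6, 'target_energy': 0.5, 'target_tempo': 140}
```
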
#### Summary

This module provides the memory and learning layer of the AI agent. It enhances the user experience over time by adapting to personal tastes and enabling feedback-driven refinement. It also supports privacy-conscious users by offering a non-personalized exploratory mode. This is a key differentiator from static recommendation agents and allows the system to evolve with each interaction.

---

### Module 5: Conversational Interface & Dialogue Controller

#### Goal

To enable emotionally aware, real-time, and intuitive interaction between users and the AI music agent through a chatbot-style interface. This module combines two LLM-powered components: one that generates warm, human-like responses, and another that parses actionable intents for playlist control.

#### Problem

Users often prefer expressing music preferences in free-form natural language ("I'm sad but want to feel better") rather than technical parameters (valence, energy, etc.). Traditional keyword-based systems fail to capture subtle intent, mood shifts, or emotional nuance.

#### Solution & Implementation

This module employs a **two-layer conversational architecture**:

1. **LLM Response Generator** – Generates empathetic, mood-matching replies using a locally deployed open-source LLM (e.g., `Mistral-7B-Instruct`).
2. **Intent & Emotion Extractor** – Parses the user's message into structured data for downstream modules (e.g., mood, energy, genre hints) using prompt engineering or fine-tuned classifiers.

Both layers operate asynchronously, or can be fused into one when using a multi-turn capable model.

#### CLI PoC: Empathetic Response Generator

```python
# dialogue_agent.py
import os
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.1"
CACHE_DIR = "/tmp/mistral_cache"
HF_TOKEN = os.environ.get("HF_TOKEN")

# Load model
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, cache_dir=CACHE_DIR, token=HF_TOKEN,
    torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, cache_dir=CACHE_DIR, token=HF_TOKEN)

def generate_response(user_input: str) -> str:
    prompt = f"""<s>[INST] You are an empathetic AI music assistant named MeowWave.
Your task is to understand the user's emotional state and suggest music that matches their mood.
Keep your response warm, concise, and avoid repeating the input.
Always offer a genre or vibe that fits.
Now respond to this message:
"{user_input}" [/INST]"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True,
                             top_p=0.9, temperature=0.7)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return decoded.split("[/INST]")[-1].strip()
```

#### Example A: Emotional Support Interaction

```
🎧 Mistral MusicBot is ready! Type 'exit' to quit.

You: I’m feeling heartbroken and don’t want to think too much.

Agent: I'm sorry to hear that you're feeling heartbroken. It's important to take time for yourself and give yourself space to heal. Music can be a powerful tool in helping you through this difficult time. I suggest listening to some soft, soulful ballads in the pop or singer-songwriter genre. These genres often feature lyrics that express emotions of heartbreak and loss, and the gentle melodies can be soothing and comforting.
```

#### Example B: Productivity Context Interaction

```
🎧 Mistral MusicBot is ready! Type 'exit' to quit.

You: I'm starting a deep work session. Any good tunes?

Agent: Sure thing! For a productive work session, I recommend some instrumental music. Jazz, classical, or ambient electronic music would be great options. Would you like me to play some of that for you?
```

#### Intent & Emotion Extraction (Structured Layer)

Example output from a separate intent parser LLM (or a future fine-tuned classifier):

```json
{
  "mood": "heartbroken",
  "valence": 0.2,
  "energy": 0.3,
  "context": ["emotional support"],
  "recommended_genres": ["singer-songwriter", "ballads"]
}
```

This can be passed directly to Module 2's playlist generation pipeline.

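The structured layer itself is not yet implemented; below is a prompt-engineering sketch that reuses the model loaded in `dialogue_agent.py` and parses the reply as JSON. The prompt wording and the neutral fallback values are assumptions.

```python
# intent_extractor_sketch.py: structured-layer sketch. Reuses the model and
# tokenizer loaded by the CLI PoC above; prompt wording is an assumption.
import json

from dialogue_agent import model, tokenizer  # loaded in the CLI PoC above

EXTRACT_PROMPT = """<s>[INST] Extract the user's music intent from the message below.
Reply with ONLY a JSON object with keys: mood (string), valence (0-1),
energy (0-1), context (list of strings), recommended_genres (list of strings).

Message: "{message}" [/INST]"""

def extract_intent(message: str) -> dict:
    prompt = EXTRACT_PROMPT.format(message=message)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding: structured output should be deterministic.
    outputs = model.generate(**inputs, max_new_tokens=120, do_sample=False)
    reply = tokenizer.decode(outputs[0], skip_special_tokens=True).split("[/INST]")[-1]
    try:
        # Keep only the first {...} span in case the model adds extra prose.
        payload = reply[reply.index("{"): reply.rindex("}") + 1]
        return json.loads(payload)
    except ValueError:  # covers both missing braces and malformed JSON
        # Neutral fallback keeps the pipeline alive on a bad generation.
        return {"mood": "neutral", "valence": 0.5, "energy": 0.5,
                "context": [], "recommended_genres": []}
```
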
#### Extensibility Plan

- Introduce memory/state (e.g., recent feedback, persistent mood)
- Add LLM role customization: "DJ Chillwave" vs "Motivation Coach"
- Enable GUI toggles for personality, tone, and recommendation style
- Multi-turn context tracking for evolving dialogue sessions

#### Summary

This module transforms the system from a static recommender to a truly conversational music agent. With both warm, human-like dialogue and actionable data extraction, it becomes a compelling emotional assistant and playlist navigator. The dual-layer design ensures flexibility, modularity, and the ability to evolve toward more sophisticated use cases.

---

### Module 6: Explainable Recommendation & Transparency Engine

#### Goal

To increase user trust, transparency, and engagement by providing human-readable explanations for why specific tracks are recommended. This module bridges the gap between black-box recommendation logic and interpretable, user-facing feedback. It draws on cognitive theories of explanation, transparency, and emotional alignment to enhance human-agent understanding.

Informed by affective computing and human-centered AI design, the module aims to deliver justifications that feel emotionally resonant, personalized, and context-aware.

#### Problem

Users often feel disconnected from recommendation systems due to their opaque nature. Without transparency, users are less likely to trust or engage meaningfully with the system. The lack of interpretability limits opportunities for feedback, learning, and affective bonding.

Psychological studies indicate that users are more likely to trust and follow AI decisions when they understand the rationale behind them. For emotionally driven use cases like music, explanations aligned with a user’s current state and preferences can significantly boost satisfaction and perceived empathy.

#### Solution & Implementation

We implement a dual-layer explanation engine that provides:

- Textual justifications based on user mood, preferences, and track features
- Visual cues, such as radar charts and affective icons, to intuitively convey audio profiles

These explanations are derived from a combination of:

- User mood/sentiment (from Module 1)
- Preferences (likes/dislikes, from Module 4)
- Track-level features (valence, energy, genre, acousticness, etc.)
- Similarity to known liked tracks (embedding-based or rule-based)

Explanations are generated through structured templates and optionally enhanced via large language models (LLMs), with future extensions into real-time adaptive narrative generation.

#### Example: Text-Based Explanation

"This track was selected because it aligns with your current mood of 'heartbroken' through low valence and tempo. It also resembles your liked artist Adele, and you’ve previously enjoyed similar ballads."

*Energy: 0.24 | Valence: 0.18 | Acousticness: High | Lyrics: Emotional breakup theme*

#### Example: Visual Radar Chart Comparison

Compare the track with your historical preferences across key dimensions.

![Radar chart comparing a track's audio profile with the user's historical preferences](https://hackmd.io/_uploads/S1MBQYTT1l.png)

#### PoC Implementation (Rule-Based Explanation Generator)

```python
# explain_track.py
def explain(track_id, features, user_profile):
    mood = user_profile.get("mood", "neutral")
    energy = user_profile.get("energy", 0.5)
    genre = user_profile.get("genre", "pop")

    explanation = f"This track matches your recent mood ({mood})"

    if features.get("energy"):
        if features["energy"] > energy:
            explanation += ", and has slightly more energy to elevate your experience."
        else:
            explanation += ", and has calmer energy to match your preference."

    if features.get("genre") == genre:
        explanation += f" It also belongs to your preferred genre: {genre}."

    if features.get("similar_to_liked"):
        explanation += " Similar to songs you’ve liked before."

    return explanation

# Example call
sample_features = {"energy": 0.7, "genre": "pop", "similar_to_liked": True}
user_profile = {"mood": "happy", "energy": 0.6, "genre": "pop"}
print(explain("track123", sample_features, user_profile))
```

#### Integration in UI

Each recommended track can include a collapsible explanation panel:

```
Track Title – Artist Name
"This track aligns with your mood (calm), has low energy, and fits your preferred genre: jazz."
Acousticness: High | Energy: Low | Genre Match: Yes
Explain: [Expand]
```

#### Extensibility Plan

- Integrate cosine similarity with liked-song embeddings for personalized traceability (a sketch follows this list)
- Support real-time explanation generation using transformer-based LLMs
- Add visual explanation components (radar chart, color-coded mood tags)
- UI toggles for "Why was this recommended?" to encourage curiosity-driven interaction

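As a sketch of the first extensibility item, the snippet below scores a candidate against liked-song vectors with cosine similarity. Representing each track by its raw audio features is a placeholder assumption; learned embeddings could be dropped in without changing the interface.

```python
# similarity_sketch.py: cosine similarity against liked-song vectors
# (Extensibility Plan, item 1). Feature vectors are illustrative; learned
# embeddings could replace them without changing the interface.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_to_liked(track_vec: np.ndarray, liked_vecs: list) -> float:
    """Max similarity to any liked track; feeds the `similar_to_liked` flag
    used by explain_track.py above."""
    return max(cosine(track_vec, liked) for liked in liked_vecs)

if __name__ == "__main__":
    # Feature order (assumed): [valence, energy, acousticness, danceability]
    liked = [np.array([0.2, 0.3, 0.9, 0.3]),   # e.g., a quiet ballad
             np.array([0.8, 0.7, 0.1, 0.8])]   # e.g., an upbeat pop track
    candidate = np.array([0.25, 0.35, 0.85, 0.3])
    sim = similarity_to_liked(candidate, liked)
    print(f"similarity={sim:.3f}", "-> similar_to_liked =", sim > 0.95)
```
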
#### Summary

This module transforms the music agent from a black-box predictor into a transparent, explainable system. By surfacing structured, emotionally grounded explanations, it builds trust, supports reflective feedback, and deepens user engagement. Explanations serve as cognitive anchors for users to understand and influence the agent's behavior—making the system not only smarter, but also more relatable and human-aligned.
