# Recommender with LLM (AOAI service)

---

# Recommender systems

---

## recap...and more detail

- Content-based
  - Find "similar" recommendation targets: `cosine_similarity`
  - Turn discrete data into continuous values (vectorization): `TfidfVectorizer`, `CountVectorizer`
  - Text -> vector: think of it as a projection
  - Orthogonality: don't compute distances in a distorted ~~map~~ space!
  - The same idea applies to regression and principal component analysis (PCA) in classical machine learning

---

## recap...and more detail

- Collaborative filtering
  - `user-item matrix`
  - User behavior "flooding" the data
  - From `One-hot-encoding` to `word2vec`
  - A more "representative" (text -> vector) transformation is needed
  - Natural language processing (NLP) provides a solution

---

## recap...and more detail

- Embedding-based
  - https://medium.com/%E6%95%B8%E5%AD%B8-%E4%BA%BA%E5%B7%A5%E6%99%BA%E6%85%A7%E8%88%87%E8%9F%92%E8%9B%87/%E6%8E%A8%E8%96%A6%E7%B3%BB%E7%B5%B1%E5%AF%A6%E5%8B%99-%E4%B8%80-embedding-%E6%8A%80%E5%B7%A7-a4cc69775b18
  - Describing user information: user features
    - `word2vec`
  - Describing user history: user-item relationship
    - `seq2seq`
  - Describing user social relationships: user-user similarity
  - Describing product information: product features
  - Describing product similarity: market basket analysis
    - `{XXX}2vec`

---

# Large language models

## LLMs: *L*arge *L*anguage *M*odel*s*

Models trained on massive corpora: pre-trained base model

---

## Embedding model

![](https://cdn.openai.com/new-and-improved-embedding-model/draft-20221214a/vectors-2.svg)

https://openai.com/blog/new-and-improved-embedding-model

---

## Practice

https://cloudatlas.me/how-to-build-a-basic-recommender-system-using-azure-openai-embeddings-2188e172338

https://github.com/openai/openai-cookbook/blob/main/examples/Recommendation_using_embeddings.ipynb

---

## More applications

https://platform.openai.com/docs/guides/embeddings/embeddings

- Q&A: https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna
- Search
- Anomaly detection
- Classification

---

# Q & A
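
---

## Appendix: content-based sketch

The content-based recap (text -> vector, then `cosine_similarity`) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation from the linked notebooks: the catalog `ITEMS` and the helpers `vectorize`, `cosine`, and `recommend` are hypothetical names, and `vectorize` is only a tiny stand-in for what `CountVectorizer` does.

```python
import math
from collections import Counter

# Hypothetical toy catalog; the item texts are made up for illustration.
ITEMS = {
    "A": "wireless noise cancelling headphones",
    "B": "wireless bluetooth headphones",
    "C": "stainless steel kitchen knife",
}


def vectorize(text):
    """Map text to a sparse term-count vector (stand-in for CountVectorizer)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[term] * v[term] for term in u if term in v)
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def recommend(item_id, items=ITEMS, top_k=2):
    """Rank the other items by similarity to item_id, most similar first."""
    query = vectorize(items[item_id])
    scores = [
        (other, cosine(query, vectorize(text)))
        for other, text in items.items()
        if other != item_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    for other, score in recommend("A"):
        print(other, round(score, 3))
```

In the embedding-based variant, `vectorize` would presumably be replaced by a call to an embedding model that returns dense vectors, while the ranking step stays the same.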