# Virtual Waifu

## Week 0D

----

## What Is a Virtual Waifu

![](https://i.imgur.com/L2eB9s4.png)

[Source](https://tinyurl.com/4m45ctmw)

----

## Three Essentials of My Waifu

+ She moves (Visual)
+ She speaks (Text-To-Speech, TTS)
+ She chats (Chat)

----

## She Moves - Vision

![](https://i.imgur.com/pd0fyCTl.png)

+ [Midjourney](https://www.midjourney.com/home/)
+ [HuggingFace Diffusion Models](https://huggingface.co/andite/anything-v4.0)

----

## She Speaks - Hearing

+ Fixed voices: Siri, the Google Translate voice
+ Cloned voices: [Voice Clone](https://github.com/CorentinJ/Real-Time-Voice-Cloning), [ElevenLabs](https://beta.elevenlabs.io/)

----

## She Chats - Hallucination

![](https://i.imgur.com/OJZ2Zjf.png)

[ChatGPT](https://chat.openai.com/chat)

---

# Diffusion Model

----

## What's in a Name

Diffusion: to spread out, to scatter

----

## What Gets Diffused

1. Add a little noise to the image
2. Train the model to remove the noise
3. Add even more noise and repeat

![](https://i.imgur.com/p7PWXBDl.png)

[Source: Prof. Hung-yi Lee](https://youtu.be/azBugJzmz-o)

----

## The Diffusion Process

![](https://i.imgur.com/CxQaMQJ.gif)

----

## ControlNet

+ ControlNet constrains the model's output
+ A common application today is capturing a person's pose skeleton

[Demo: ControlNet + Anything V4](https://huggingface.co/spaces/hysts/ControlNet-with-Anything-v4)

----

## Even Stronger: multiControlNet

[Move with the beat](https://tinyurl.com/y6a4ppa4)

---

# Language Models

----

## What Is a Language Model

It estimates how probable a piece of text is.

> 再給我兩份蔥,讓我把記憶煎成餅

vs.

> 再給我兩分鐘,讓我把記憶結成冰

(The first line is a near-homophone mangling of the song lyric in the second; a good language model should rate the second as far more probable.)

----

## Kinds of Modern Language Models

+ N-Gram Model
+ Transformer Model
  + Encoder-Only
    + [Simple Demo](https://huggingface.co/martin-ha/toxic-comment-model)
  + Encoder-Decoder (Seq2Seq)
    + [Google Translate](https://tinyurl.com/328zzuvd)
  + Decoder-Only
    + [ChatGPT](https://chat.openai.com/)

----

## Very Large Language Models

![](https://i.imgur.com/5O0CqQw.png)

Large Language Model

----

## How Large Is "Large"

If each parameter is stored as Float32 = 32 bits = 4 bytes,
a 175-billion-parameter model needs about 700 GB just for its weights.

----

## LLaMA GPUs

Trained for 21 days on 2048 A100 GPUs:

> When training a 65B-parameter model, our code processes around 380 tokens/sec/GPU on 2048 A100 GPU with 80GB of RAM. This means that training over our dataset containing 1.4T tokens takes approximately 21 days.
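The training time in the quote can be sanity-checked with a little arithmetic; every number below comes from the quote itself:

```python
# Sanity check: 380 tokens/sec/GPU on 2048 GPUs over a 1.4T-token dataset.
tokens_per_sec_per_gpu = 380
gpus = 2048
seconds_per_day = 24 * 60 * 60

tokens_per_day = tokens_per_sec_per_gpu * gpus * seconds_per_day
days = 1.4e12 / tokens_per_day

print(f"{days:.1f} days")  # ~20.8 days, matching "approximately 21 days"
```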
[Meta AI: LLaMA Paper](https://arxiv.org/abs/2302.13971)

----

## OH MY GPU

+ A100: 40/80 GB of VRAM
  + [Jarvislabs.ai](https://jarvislabs.ai/)
+ 3090/4090: 24 GB of VRAM
  + [Coolpc (原價屋)](https://coolpc.com.tw/tmp/1681479452956683.htm)
+ [Colab](https://colab.research.google.com/?hl=en) provides a Tesla T4 GPU (16 GB VRAM)
+ My humble GTX 1080 Ti has only 11 GB of VRAM

----

## PEFT & LoRA

+ [Parameter-Efficient Fine-Tuning (PEFT)](https://github.com/huggingface/peft)
+ [Low-Rank Adaptation (LoRA)](https://github.com/microsoft/LoRA)

----

## Decomposition Matrices

![](https://i.imgur.com/SmFmINc.gif)

Decompose one large matrix into two small ones

----

## Counting Every Bit and Byte

![](https://i.imgur.com/1kvuTO7l.png)

Use quantization to squeeze the model into the GPU

[Source](https://huggingface.co/blog/trl-peft)

----

## Choose a Smaller Language Model

+ BigScience - BLOOM (560M ~ 3B)
+ EleutherAI - GPT-NeoX (1.3B ~ 20B)
+ Meta AI - OPT (125M ~ 66B)
+ Meta AI - LLaMA (7B ~ 65B)
+ THUDM - GLM (2B ~ 10B)
+ A 60B model quantized to 8 bits is about 60 GB

----

## Choose a Different ML Framework

+ Microsoft [ONNX Runtime](https://github.com/microsoft/onnxruntime)
+ PyTorch [TensorRT](https://github.com/pytorch/TensorRT)
+ ggerganov [ggml](https://github.com/ggerganov/ggml)
+ Alibaba [MNN](https://github.com/alibaba/MNN)

----

## LLaMA Is All the Rage Lately

+ It has spawned many LLaMA (llama) variants:
  + Alpaca
  + Vicuna
  + Koala
  + Dolly (after the cloned sheep)
+ [Demo](https://chat.lmsys.org/) from [FastChat](https://github.com/lm-sys/FastChat)

---

## All Waifus Are Illusory

![](https://i.imgur.com/EaBk2F2.png)

[Source](https://memes.tw/wtf/402046)
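----

## Appendix: LoRA by the Numbers

A minimal numeric sketch of the earlier "decompose one large matrix into two small ones" idea behind LoRA. The 4096×4096 shape and rank 8 are illustrative assumptions, not figures from any specific model:

```python
import numpy as np

d = 4096  # hypothetical weight matrix size, e.g. one attention projection
r = 8     # LoRA rank (illustrative choice)

full_params = d * d          # trainable parameters if we update W directly
lora_params = d * r + r * d  # trainable parameters for A (d x r) and B (r x d)

print(full_params)                 # 16777216
print(lora_params)                 # 65536
print(full_params // lora_params)  # 256x fewer trainable parameters

# The factor pair reconstructs a rank-r update with the full matrix's shape:
A = np.random.randn(d, r)
B = np.random.randn(r, d)
delta_W = A @ B
assert delta_W.shape == (d, d)
```

Only A and B are trained; the frozen base weights stay untouched, which is why LoRA fits on the small-VRAM cards listed above.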