
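# Project

## 8/13 ~ 18
### What we studied
>**YUKI**
```
python
matplotlib
virtualenv (virtual environments)
ML2021 videos 1~4
```
> **CURRY**
```
ML2021 lecture videos: video 21
Understood how the Transformer encoder and decoder work
Understood self-supervised learning and the BERT model (a Transformer encoder)
Practiced simple training on torchvision datasets
Practiced torch.nn modules: Conv2d, MaxPool2d, Sequential...
Visualized the training process with TensorBoard
```
[PyTorch practice (implementing a CIFAR model)](https://github.com/Narticleo/PythonPractice/blob/master/pyTorch/CIFAR10/optimizer.py)
![](https://hackmd.io/_uploads/BkByz9Yhn.png)
> **ALTHEA**
```
Progress:
ML2021 videos 1~8
Installed PyTorch
Thinking about the project
```
> **GOOD**
```
ML2021 videos 1~13
Python basics
```
### Project ideas
>**YUKI**
>##### Sign language recognition (to make learning sign language easier)
>##### Dish recognition (see which dishes can be made from the leftover food)

> **CURRY**
>##### Localized real-time weather forecasting based on image recognition
>>##### [Resource 1](https://www.cwb.gov.tw/V8/C/W/OBS_Sat.html)
>>##### [Resource 2](http://140.137.32.27/www/)
>>##### [Reference 1](https://www.perform-global.com/automl-use-case-datarobot-cwb/)
>##### Surveillance system that automatically detects suspicious behavior
>##### Lie detection from facial micro-expressions
>>##### Collect our own data on fine changes around the eyes, lips, etc.
>##### Solving a Rubik's Cube via image recognition
>>###### Could try collecting our own data

> **ALTHEA**
>##### Chick sexing via image recognition (feather color)
>> ##### [How to tell them apart](https://cs-tf.com/how-to-sex-chickens/)
>> (Current papers all apply machine learning to chick hormones or genes)
>##### Traffic snitch: detecting vehicles that fail to yield to pedestrians + red-light violations
>> ##### Kaohsiung City already has this technology
>##### Real-world subtitle glasses based on deep-learning speech recognition
>> ##### A publicly listed company abroad projects English subtitles; a Chinese version might be possible
> ##### Image-recognition stone material database

> **GOOD**

---
### Meeting 08/16/2023
#### Feedback on project ideas
> YUKI
Sign language needs an innovative application
For dishes, consider generating a dish from the leftover ingredients

>CURRY

>ALTHEA
Chick sexing via image recognition (feather color): need to find a data source
~~Traffic snitch: detecting vehicles that fail to yield to pedestrians + red-light violations~~
Real-world subtitle glasses based on deep-learning speech recognition: find a model
Image-recognition stone material database

>GOOD
Survival gadget: mushroom identifier (finished products already exist online)
Cheat gadget: motion prediction system (not very realistic)

Advice: keep brainstorming while learning
#### To study next week
```
PyTorch
matplotlib
NumPy
Git
Video 13
```
---
## 8/19 ~ 26
### What we studied
>**YUKI**
```
PyTorch
ML (10)
```
> **CURRY**
```
Finished PyTorch resource 1
Machine learning videos 22~25
```
[Full training script](https://github.com/Narticleo/PythonPractice/blob/master/pyTorch/train_cuda.py)
[TensorBoard](http://localhost:6006/?darkMode=true#timeseries)
> **ALTHEA**
```
ML2021 lecture videos 9~13
What pooling is for
CNNs and image recognition
Differences between CNNs and self-attention
How the Transformer encoder works
How the Transformer decoder works
Practiced matplotlib
```
> **GOOD**
```
Simple deep-learning program (stuck on overfitting, could not tune it away)
matplotlib basics
```
![](https://hackmd.io/_uploads/S1qDB_mT3.png)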
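![](https://hackmd.io/_uploads/HkwwBdQph.png)

---
### Meeting
#### Group meeting
```
Discussed the ML videos
Everyone reported this week's learning progress
Gave feedback on the project topics and narrowed them down to: Rubik's Cube, dishes
```
#### Lab meeting
```
Rubik's Cube: reinforcement learning, AI
Dishes: generate recipes with ChatGPT, generate an image of the finished dish
Look for related papers and see whether any exist
For overfitting, print both accuracy and testing loss
```
---
### Goals for next week
```
Manage a second pull-up on the bar
ML videos up to 17
Come up with more project ideas
```
## 8/23 ~30
### What we studied
>**YUKI**
```
Set up the PyTorch environment
Used TensorBoard
ML (11)
```
---
> **CURRY**
### ML(26-34)
- #### Explainable ML
  By extracting and comparing the outputs of each layer of a trained network, we can see what each layer contributes. In speech synthesis, for example, the voice may sound male at one layer and female at the next, which tells us that this layer carries speaker information. Likewise, extracting certain layers of a CNN shows that the generated images are diagonal lines or some other pattern, revealing which shapes a filter responds to. The aim is to interpret the model as far as possible instead of treating it as a black box.
- #### Reinforcement learning (RL)
  The actor interacts with the environment to keep producing new data, and gradient descent is run on the collected data. The biggest difference from an ordinary model is that data collection is part of the training loop. Actions must be sampled while collecting data: without enough randomness the data becomes biased and misses the more extreme cases.
  When designing the loss, each action gets its own coefficient A, and how A is designed is the core of RL. To keep the agent from being short-sighted, A should use the cumulative reward: the reward credited to an earlier action should include what happens afterwards, so that the agent learns to see the bigger picture.
  RL training also needs a critic network V that, given a state s, predicts the corresponding cumulative reward. There are two main ways to train it, MC (Monte-Carlo) and TD (temporal-difference). MC puts more weight on the links between successive actions and is trained directly on (s, a, A) samples. TD considers the overall average and treats each action as independent; its training input also includes s~t+1~, which gives the relation V(s~t~) = γV(s~t+1~) + r~t~. Feed in the current and next states, multiply the latter by γ, and the closer the difference V(s~t~) - γV(s~t+1~) is to r~t~ (the current reward) the better. The two methods therefore give slightly different values.
  Finally, A compares the average cumulative reward that follows the current action with the average cumulative reward over everything that could happen from the current state: A~t~ = r~t~ + V(s~t+1~) - V(s~t~). This tells us whether the action taken in the current state is better than average.
  imitation learning: humans demonstrate for the machine to learn from
  reward shaping: handles sparse rewards, as in AlphaGo
  DQN
- #### Lifelong learning (LL)
  Train a machine to handle tasks from different domains (in practice the tasks are not too different). When tasks are learned one after another, the earlier ones are easily forgotten because the parameters change a lot. Pulling out the earlier tasks for review every time a new task is trained would need too much data, hence the need for lifelong-learning techniques.

[MNIST dataset training](https://github.com/Narticleo/PythonPractice/blob/master/pyTorch/MNIST/mnist.py)

---
> **ALTHEA**
```
ML (14 ~ 18)
GAN, Cycle GAN
Learned how the discriminator and generator are trained adversarially in deep learning
Practiced loading data in PyTorch
Practiced TensorBoard
```
---
> **GOOD**
```
Generative networks
Self-supervised learning
```
### Project ideas
>**Recognize ingredients from images, then generate a recipe and a food photo**
>[Describing generated food, generating realistic food photos](https://www.chatgpt-prompts.net/food-photography-prompts-ai-generated-food/)
>[Recognizing fruit and vegetables with deep CNNs](https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/ccd=KYF2c./search?s=id=%22102NCNU0392017%22.&searchmode=basic)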
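>[A study on recognizing leafy green vegetables with deep learning](https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22110CCU01396019%22.&searchmode=basic)
>[Paper 2020: CookGAN: Meal Image Synthesis from Ingredients](https://openaccess.thecvf.com/content_WACV_2020/papers/Han_CookGAN_Meal_Image_Synthesis_from_Ingredients_WACV_2020_paper.pdf)
>[Paper 2023: FIRE, a model tailored for recipe generation](https://arxiv.org/abs/2308.14391)
>[Paper 2023: a model that generates recipes from an image or a prompt](https://arxiv.org/abs/2308.04579)
>[cube](https://deepcube.igb.uci.edu/)
>[lie](https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dnclcdr&s=id=%22109CCU00442026%22.&searchmode=basic&extralimit=asc=%22%E5%9C%8B%E7%AB%8B%E4%B8%AD%E6%AD%A3%E5%A4%A7%E5%AD%B8%22&extralimitunit=%E5%9C%8B%E7%AB%8B%E4%B8%AD%E6%AD%A3%E5%A4%A7%E5%AD%B8)
>[Diffusion model paper study](https://hackmd.io/@Tu32/B1-m6Tuai)

---
### Meeting
```
Diffusion models, which are better than GANs
Reinforcement learning project https://nv-tlabs.github.io/gameGAN/
Finding a project:
https://scholar.google.com.tw/
https://ieeexplore.ieee.org/Xplore/home.jsp
cvpr github
```
---
### Goals for next week
## 8/30~9/6
### What we studied
>**YUKI**
```
Self-attention
Transformer
GAN
Self-supervised Learning
Auto-encoder
```
> **CURRY**
```
```
> **ALTHEA**
```
Self-supervised learning
  Can be trained with fill-in-the-blank tasks to solve
  . NLP problems
  . image fill-in
  . audio fill-in
Auto-encoder
  A form of self-supervised learning
  The reconstruction should be as close to the input as possible
  Reduces the complex to something simple, so changes in an image can be observed
  Denoising auto-encoder == BERT
Feature disentanglement
  Separate what the different dimensions of the vector represent
  . Voice conversion (imitating a voice)
  . VQVAE
  If the embedding is turned into words, it performs summarization
  A discriminator is added in the middle
  A tree can also serve as the embedding
Auto-encoders can do compression, which is lossy
Auto-encoders can do one-class classification
  . real vs. fake face classification
```
> **GOOD**
```
reinforcement learning
auto-encoder
```
---
### Meeting
**Group meeting**
Generating sound from images [too hard?]
Hualien one-day trip route generation [limit the scope to Hualien to shrink the dataset]
Volleyball for blind players
GameGAN autonomous driving [accident detection, driving behavior]
Meeting times: Mon / Tue / Thu 19:00

**Lab meeting**
Diffusion models are hard to understand
NLP travel problem: transfer learning would be too simple
Summarize the summer work, propose the topics we think are feasible, and prepare slides outlining what we would do.
Check how much data GameGAN uses and think about where dashcam data could come from https://nv-tlabs.github.io/gameGAN/
Look up papers on generating sound from images
### Main project candidates right now
* Recognition for the blind (funding issue)
* Ingredients and recipes (diffusion models are a bit hard to understand)
* Rubik's Cube (has been done before; could try different models)
* Travel route generation (transfer learning is too simple)
* GameGAN autonomous driving (improve autonomous-driving accuracy; infrared data volume is large, image processing is fast but inaccurate)

---
### Current project list
* ==Skateboard pose detection==
* ~~Identifying people from portraits, for crowd-flow analysis~~
* Melon rind pattern sweetness analysis
* ~~War-chess (strategy game) model~~
* ==Image-to-image extension and blending (application)==
* ~~Volleyball for blind players~~
* ~~Recognition for the blind~~
* ==Ingredients and recipes==
* ~~Rubik's Cube~~
* ~~Travel route generation~~
* ~~GameGAN autonomous driving~~
* ~~Localized real-time weather forecasting based on image recognition~~
* Surveillance system that automatically detects suspicious behavior
* Lie detection from facial micro-expressions

https://codi-gen.github.io/
[cvpr](https://github.com/amusi/CVPR2023-Papers-with-Code#CLIP)

Direction
Method
Purpose
Deliverable
Dataset
Features
Or find a paper we want to build on
### Next week's meeting
[Go read papers](https://scholar.google.com.tw/schhp?hl=zh-TW&as_sdt=0,5)
Paper structure
This week's progress
## 9/20~9/26
### This week's progress
==Please search for material related to the topics above==
> **YUKI**

Feed in data, generate recipes
[RecipeGPT: Generative Pre-training Based Cooking Recipe Generation and Evaluation System](https://arxiv.org/pdf/2003.02498.pdf)
1. [Allrecipes](https://www.allrecipes.com/)
2. [Yummly](https://www.yummly.com/)
3. pre-trained transformer GPT-2

[Ratatouille: A tool for Novel Recipe Generation](https://arxiv.org/pdf/2206.08267.pdf)
Produce the cooking steps for a food image (the two below look quite similar; not read in detail yet)
[Structure-Aware Generation Network for Recipe Generation from Images](https://arxiv.org/pdf/2009.00944.pdf)
[Learning Structural Representations for Recipe Generation and Food Retrieval](https://arxiv.org/pdf/2110.01209.pdf)
Trash classification
[A Novel Framework for Trash Classification Using Deep Transfer Learning](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8930948)
[A Deep Trash Classification Model on Raspberry Pi 4](https://cdn.techscience.cn/ueditor/files/iasc/TSP_IASC-35-2/TSP_IASC_29078/TSP_IASC_29078.pdf)
[A Systematic Review of Machine Learning Approaches for Trash Classification](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10125688)
> **CURRY**

[ViT(vision transformer)](https://www.youtube.com/watch?v=FRFt3x0bO94)
> **ALTHEA**

[Food and calorie analysis](https://www.sciencedirect.com/science/article/abs/pii/S0308814622012055) (another group's topic??)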
[Public chicken sex dataset](https://github.com/PuristWu/Identifying-gender)
[Chicken-related paper](https://www.mdpi.com/1099-4300/22/7/719)
**Other paper ideas**
* Human behavior analysis
* Staging a play by converting video to 3D (adding makeup and hair to the 3D models)
* FB handwriting imitation (self-supervised learning)
* Generating faces from text
* Imitating calligraphers
* Gait recognition
* Smart trash can with image recognition

**Some interesting papers**
Image-to-3D-model paper
[I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image](https://arxiv.org/abs/2008.03713)
Adding motion markers on the human body
[3D Human Mesh Estimation from Virtual Markers](https://openaccess.thecvf.com/content/CVPR2023/papers/Ma_3D_Human_Mesh_Estimation_From_Virtual_Markers_CVPR_2023_paper.pdf)
[3D Human Mesh Estimation from Virtual Markers **YT video**](https://www.youtube.com/watch?v=je2gNUiYl2c)
> **GOOD**

Garbage Classification
[2020: An Automatic Garbage Classification System Based on Deep Learning](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9144549)
[2020: WasNet: A Neural Network-Based Garbage Collection Management System](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9107232)
[2021: Using YOLOv5 for Garbage Classification](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9550790)
[2021: A Novel Intelligent Garbage Classification System Based on Deep Learning and an Embedded Linux System](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9543664)
[2023: Garbage Content Estimation Using Internet of Things and Machine Learning](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10036411)
[Grade 11 high-school project](https://twsf.ntsec.gov.tw/activity/race-1/59/pdf/NPHSF2019-052508.pdf)
#### Group meeting
topic: TBD

require:
1. Recognize trash from images
2. Recognize trash from sound
3. Other (fill-level monitoring, automatic opening, statistics)

method:
1. Image recognition
2. Sound recognition

goal:
1. Apply deep learning to recycling classification so that sorting at the front end of the recycling process is more accurate
2. Lower the cost of educating the public about complicated recycling rules

vision:
1. Add finer recycling categories, such as plastic type and glass color, to automate back-end sorting at large recycling plants

#### Meeting
1. Find the 2020 papers that did something similar (and state what we do differently)
2. Learn 3D drawing and draw the design (SketchUp)
3. Planned schedule (Gantt chart)
4. Budget estimate
5. Look into dual-input models (image + sound)
6. Prepare the presentation
7. Be ready to answer questions

Online meeting next Tuesday at 7
Next Wednesday at 7
## 9/27~10/2
### What we studied
>**YUKI**

Rough modeling in SketchUp
[Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification](https://ieeexplore.ieee.org/abstract/document/7829341?casa_token=ryEErxH-lEIAAAAA:LDe8YewLgsKCURic-_jV-Mbm72K-kYtLcEvZFDmyv88PDjZP09Cwal4ikvW1BFrt-T56P-_HhQ)
Reviewed the CNN videos
> **CURRY**

[Quick CNN review](https://www.youtube.com/watch?v=AFlIM0jSI9I)
[Neural network compression](https://www.youtube.com/watch?v=xrlbLPaq_Og)
> **ALTHEA**

[CNN environmental sound classification](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7829341)
[Unified CNN-RNN framework for classification](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9717998)
[Audio data augmentation](https://towardsdatascience.com/audio-deep-learning-made-simple-part-3-data-preparation-and-augmentation-24c6e1f6b52)
> **GOOD**

## 10/11~10/17
### What we studied
>**YUKI**

[trashdataset1](https://www.kaggle.com/datasets/masterofdeception/trash-image-and-pretrained-model/data)
[trashdataset2](https://www.kaggle.com/datasets/asdasdasasdas/garbage-classification/data)
> **CURRY**

> **ALTHEA**

> **GOOD**

## 10/18~10/24
### What we studied
>**YUKI**

Dataset: about 25 trash photos
Read simple CNN model code found online
[Convolution Neural Network](https://medium.com/%E9%9B%9E%E9%9B%9E%E8%88%87%E5%85%94%E5%85%94%E7%9A%84%E5%B7%A5%E7%A8%8B%E4%B8%96%E7%95%8C/%E6%A9%9F%E5%99%A8%E5%AD%B8%E7%BF%92-ml-note-convolution-neural-network-%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF-bfa8566744e9)
> **CURRY**

https://github.com/salesforce/ALBEF
> **ALTHEA**

[Audio preprocessing](https://towardsdatascience.com/audio-deep-learning-made-simple-sound-classification-step-by-step-cebc936bbe5)
https://urbansounddataset.weebly.com/download-urbansound8k.html
> **GOOD**

### Group meeting 10/20
1. Research audio cleaning
2. Shoot 50 photos for the dataset
## 10/25~10/31
### What we studied
>**YUKI**

Dataset: about 50 trash photos
> **CURRY**
> [Individual identification of dolphins and whales](https://www.kuroshio.org.tw/newsite/article01.php?class_subitem_id=351)

> **ALTHEA**

Collected 50 recycling samples
> **GOOD**

### 11/14 meeting
* (If) we change the project topic we need to give an explanation. Open questions: confirm what the dataset looks like. Labels? How many?
Current difficulties? If there are no labels, which model?? DL clustering, unsupervised. The state of the database and the requirements. Confirm the requirements tomorrow, do not commit yet, then discuss again
* Original topic https://ieeexplore.ieee.org/abstract/document/9190246 The ALBEF code is too hard for us to modify
## 10/25~10/31
### What we studied
>**YUKI**

Unsupervised learning
K-means algorithm
> **CURRY**

[Individual common dolphin identification via metric embedding learning](https://arxiv.org/abs/1901.03662)
> **ALTHEA**

[findfinR](C:\Users\althea\Desktop\dolFIN)
https://www.frontiersin.org/articles/10.3389/fmars.2022.849813/full#h4
[dataset](https://www.ncei.noaa.gov/archive/archive-management-system/OAS/bin/prd/jquery/accession/download/254384)
> **GOOD**

## 11/21~11/28
### What we studied
>**YUKI**

> **CURRY**
## Techniques usable in the project
#### data augmentation
Perspective transformation
![image](https://hackmd.io/_uploads/B1utwdyHp.png)
![image](https://hackmd.io/_uploads/BkCiPOyrT.png)
Biomimetic transformation
##### Keypoint algorithms: orb,sift,rootsift,superpoint
SuperPoint has been applied to cetaceans and fish, but it is better at picking up corners and edges and is less effective on the complex markings of Risso's dolphins.
## Reference papers
#### [Finding Nemo’s Giant Cousin: Keypoint Matching for Robust Re-Identification of Giant Sunfish](https://www.mdpi.com/2077-1312/11/5/889)
Published: 2023/4/19
Runs keypoint detection on each image and sets a threshold: a pair with more than a certain number of matches is treated as the same individual. Every image has to be compared against all the others, so the complexity is O(n²), which makes it a poor fit for large datasets. Matching becomes a binary decision.
#### [Membership Inference Attack for Beluga Whales Discrimination](https://arxiv.org/abs/2302.14769)
Published: 2023/2/28
Uses a membership inference attack to decide whether an input belongs to the training set. It can flag new individuals, but adding them to the training set means retraining the whole model.
#### [Re-identification of fish individuals of undulate skate via deep learning within a few-shot context](https://www.sciencedirect.com/science/article/pii/S1574954123000651)
Published: 2023/7/15
Trains a Siamese network (does not effectively solve the open-set problem)
InceptionResNetV2 pre-trained on ImageNet
Standard data augmentation
> **ALTHEA**

Dataset status
* 3000 partially labeled photos (almost unusable: the label files are poor and would need a lot of manual fixing)
* The 「110知各提」 dataset (photos + CSV) arrives the week after next (usable)
* JSON files that map the cropped dorsal fins to their images

> **GOOD**
> Data preprocessing (temporary):
>>-> Roboflow (label data manually)
>>-> YOLO (to be trained; to be modified so that it returns bounding-box coordinates)
>>-> SAM (to be modified so that it returns results in the desired format)
>>-> Result (no background / '0'-pixel background)

## 11/29~12/5
### What we studied
>**YUKI**

[Towards Automatic Cetacean Photo-Identification: A Framework for Fine-Grain, Few-Shot Learning in Marine Ecology](https://ieeexplore.ieee.org/abstract/document/10020942?casa_token=mi8w8wH_KRkAAAAA:j2WY9MD0cGGnkcIKSYYltIfwL-V0Fox-8Xz3ZxsEYd4h73vfH-4QclAsjvgtLk4TiycGUBin)
Possibly useful, and fairly recent
1. Mask R-CNN to detect dorsal fins
2. Siamese Network
3. triplet loss

[Re-identification of fish individuals of undulate skate via deep learning within a few-shot context](https://www.sciencedirect.com/science/article/pii/S1574954123000651)
1. [code](https://github.com/nuriagomv/Application-of-Deep-Learning-techniques-for-the-photo-identification-of-fish-individuals)
2. The end of the methods section says “top-k accuracy” cannot be used
3. The data augmentation methods are worth referencing

> **CURRY**
> Data preprocessing: remove the background, boost contrast (twice), drop overly dark regions, apply a random perspective transform and then a random rotation

![image](https://hackmd.io/_uploads/Hke13ciHa.png)
![image](https://hackmd.io/_uploads/S14encjBp.png)
![image](https://hackmd.io/_uploads/H1sB2coH6.png)![image](https://hackmd.io/_uploads/HkNv2qoHa.png)
[code of preprocess](https://colab.research.google.com/drive/1tcfuA459zIcbnrCGzyxFgIstqpK5Ec6s#scrollTo=2qms-y1s5S_O)
> **ALTHEA**
##### First attempt
This was the first test of training YOLO v8 (on a dataset someone else had annotated)
* [dolphin dorsal fin dataset on roboflow](https://universe.roboflow.com/iurii-sobolev/whales-1nzxh)
* [Reference video: roboflow on yolo v8](https://www.youtube.com/watch?v=wuZtUMEiKWY)

The trained model was useless
[colab yolo v8 trained model finished](https://colab.research.google.com/drive/1MZQBQO6zpjuM_WRiwDp0gi5G1zUlYFam?usp=sharing)
![image](https://hackmd.io/_uploads/rJmXcUqBT.png)
![image](https://hackmd.io/_uploads/SkHVcLcST.png)
![image](https://hackmd.io/_uploads/B1nSc8cST.png)
##### Second attempt
* [dolphin dorsal fin annotated dataset 2](https://universe.roboflow.com/theo-lg/projet-fouille-de-donnees/dataset/3/images)
* [result]()

A bit better, but the trained model is still useless...
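![image](https://hackmd.io/_uploads/SkxGCljSp.png)
![image](https://hackmd.io/_uploads/rJrhRgorT.png)
Because the success rate of the model above is too low to be usable, and its boxes differ from the ones we intend to use, we will annotate the data ourselves
[Reference video](https://www.youtube.com/watch?v=m9fH9OWn8YM)
> **GOOD**

Tested how well SAM segments dorsal fins; the results do not look very good
![下載](https://hackmd.io/_uploads/BJ2Ci_3ra.png)
![下載 (1)](https://hackmd.io/_uploads/Syy13uhSp.png)
Deep metric learning is probably what we should be doing. More recent related papers:
https://ieeexplore.ieee.org/document/9897167
https://ieeexplore.ieee.org/document/9897939
EfficientNet-B0 + triplet loss / semi-hard triplet loss / hard triplet loss (unfinished)
## 12/6~12/12
### What we studied
>**YUKI**

[labelGo annotation assistant](https://blog.csdn.net/fengdu78/article/details/130479351)
[LabelImg](https://blog.gtwang.org/useful-tools/labelimg-graphical-image-annotation-tool-tutorial/)
> **CURRY**

> **ALTHEA**
* tool: semi auto annotation: CVAT
* labelbox: ml assisted labeling(python)

[labeling tips](https://medium.com/@frozenfung/%E6%95%B8%E6%93%9A%E6%A8%99%E8%A8%98%E5%9C%A8%E6%A9%9F%E5%99%A8%E5%AD%B8%E7%BF%92%E4%B8%AD%E7%9A%84%E8%A7%92%E8%89%B2-%E4%B8%8B-1675274efe59)
(Of the dozen or so tools I looked at, only Labelbox might be able to assist with rectangle annotation)
The tools above only offer AI-assisted segmentation or bounding boxes, or need code written for automatic detection. Since we do not actually need a huge amount of data to train the detector, and after weighing the implementation and learning time, we decided to annotate manually (with CVAT)
* Each person labels 150 images before Friday; train over the weekend

![image](https://hackmd.io/_uploads/Bk_9Kk88a.png)
> **GOOD**

Pretrained NN + hardest triplet loss + t-SNE: finished
## 12/13~12/20
### What we studied
>**YUKI**

Helped label data
> **CURRY**

Labeled data
**Handling overlapping photos:**
[rembg extension](https://colab.research.google.com/drive/1uWmzH8kX7QlnKeEBDOhntJ8fgGI315Il#scrollTo=f8vBDRwExUgr)
Applying some edge sharpening first ensures the central object is kept, but other overlapping objects are kept as well
[SAM segmentation](https://colab.research.google.com/drive/1AhrHZM6_mqvuZFigmTefZHV1jRoeXTNu#scrollTo=zTGeu8skhx71)
SAM can select objects in an image. Picking the object at the image center keeps only the target, but the segmentation quality is poorer and contour information may be lost.
> **ALTHEA**
1. Bought a hard drive (the receipt has not arrived yet so I cannot get reimbursed; I'm broke)
2. Downloaded the labeled data to the drive

Annotated data with CVAT and LabelImg
Trained a model with YOLO v8
* Ran into many problems.... (missing yaml path, the name/main issue)
* Training is in progress; the result looks like nothing useful

Possible improvements
* Tune batch size and epochs
* Use the val dataset to monitor training
* Something is wrong with the way I train

> **GOOD**
> [triplet](https://colab.research.google.com/drive/1BFC2-ZeDdEyUItPGWTs1SKfIaCrIZCu3)

## 12/21 ~ 12/26
### What we studied
>**YUKI**

train:136
test:26
valid:26
[detect](https://colab.research.google.com/drive/1JzXqRcDTYX3Aekk_zIFgptg75grLX5Bq#scrollTo=lMLBewLNR5ZA)
> **CURRY**

> **ALTHEA**

Cause of the NaN loss (my GPU is a GTX 16XX series card): https://blog.csdn.net/weixin_42283539/article/details/129988766
> **GOOD**

[main](https://colab.research.google.com/drive/1W2hXNy6_qJBDSg76VcDmqtL1PIBEAb9H)
#### 12/26 meeting
Try running YOLO v8 on the lab computer (core dumped)
triplet loss
##### Next week
Get YOLO working
Manually separate left and right dorsal fins
Finish the data generator
## 1/21 meeting
### Next week
Present the SQL schema (for 劉)
Build a web page template
王柏霖 finishes training tomorrow; integrate before Thursday
Fix YOLO before Thursday
Before Tuesday, 吳 (cite) and 羅 (introduction) work on the report
羅 and 劉 collect distractors: look for dorsal-fin data online
## ~ 1/23
### What we studied
>**YUKI**

YOLO training
Labeled data and trained -> results look about the same; the original problem is not fully solved
Learned database concepts -> practiced MySQL syntax
> **CURRY**

Precise background removal is not handled yet, so for now I cleaned the folders by hand, removed the photos where background removal failed, and put together two training sets: original photos and background-removed photos
raw:
![image](https://hackmd.io/_uploads/HJD9qf6Yp.png)
rmv:
![image](https://hackmd.io/_uploads/S1psqMpYT.png)
Folder of failed background removals (can be used later to test new removal methods)
![image](https://hackmd.io/_uploads/r15PjMaY6.png)
![image](https://hackmd.io/_uploads/H1MDhGaFT.png)
Background removal fails for about 11% of the photos

Validation method and dataloader
dataloader:
Photos are placed in one folder per individual
![image](https://hackmd.io/_uploads/B16l8ZaFa.png)
Split evenly into 10 folds
![image](https://hackmd.io/_uploads/Syzi8ZTYp.png)![image](https://hackmd.io/_uploads/Bydn8Z6tT.png)
The 10th fold is kept out of the model and used for testing
The remaining 9 folds are used for cross-validation
One epoch has 9 rounds; in round i, fold i is the validation set and the others are the training set
ex. round 5: training set: (1,2,3,4,6,7,8,9), validation set: (5)
![image](https://hackmd.io/_uploads/BkPJ_bpFa.png)
Each round has 2000 (tentative) batches
One batch = catalog*num
ex. catalog = 5, num = 3
A1 A2 A3
B1 B2 B3
C1 C2 C3
D1 D2 D3
E1 E2 E3
Every photo is paired once with each of the other photos of the same individual (ALL); negatives are picked separately
A1 -> A2, A3 are positives
A2 -> A1, A3 are positives
A3 -> A1, A2 are positives
B1 -> ......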
(num-1)catalog<sup>2</sup>/2 triplets

validation method:
Take each fold in turn as the validation set
![image](https://hackmd.io/_uploads/HJMdPGatp.png)
Procedure: for each id, pick some photos as queries (the search targets) and put the rest in the gallery (the database)
![image](https://hackmd.io/_uploads/ByS6Df6Fp.png)
Parameters:
round: which fold is used for validation
shuffle: whether to shuffle after a fold is chosen
test_num: how many photos per id become queries; all the rest go to the gallery
gallery_exception: if the queries already contain every photo of an id, how many of them to randomly switch over to the gallery
Go through the queries one by one: pass all gallery photos through the encoder, compute their distance to the query, sort, and check the 5 most similar
![image](https://hackmd.io/_uploads/r1rntGTFT.png)
Compute top1, top3, top5 accuracy
> **ALTHEA**
1. Connected to the server to train YOLO (issue: may need to add a lot of tail-fin photos to strengthen the data)
2. Labeled another 200 photos
3. Studied the structure of the undergraduate research proposal and proposed next steps at the meeting
4. Studied how to build the front-end interface and presented it at the meeting

Front-end web language:
* JavaScript
  * Use React to build the user interface, including the image upload button, image preview, result display, etc.
  * After the user uploads an image, process it in the front end with TensorFlow.js and run the deep learning model to find similar images
* Data exchange: REST API or GraphQL.
* Compute: cloud or local server
* Database:
  * MySQL

![image](https://hackmd.io/_uploads/S16-2MaYa.png)
![image](https://hackmd.io/_uploads/SyxQnM6F6.png)
> **GOOD**

## 1/24 ~ 1/30
### What we studied
>**YUKI**

[1](https://dev.to/andreygermanov/how-to-detect-objects-in-videos-in-a-web-browser-using-yolov8-neural-network-and-javascript-lfb)
[2](https://github.com/Hyuto/yolov8-tfjs)
[Custom object detection in the browser using TensorFlow.js](https://blog.tensorflow.org/2021/01/custom-object-detection-in-browser.html)
[Object detection in the browser with a camera and TensorFlow.js](https://www.jiqizhixin.com/articles/2018-05-02-2)
[tfjs issue](https://github.com/ultralytics/ultralytics/issues/869)
[This one cost me a lot of time](https://docs.ultralytics.com/zh/modes/export/)
[Format conversion](https://www.volcengine.com/theme/4285682-R-7-1)
[React](https://ithelp.ithome.com.tw/users/20116826/ironman/2278)
> **CURRY**

docker command:
![image](https://hackmd.io/_uploads/By3PvIL5T.png)
ctrl + d exits the environment
Next: fix the image display problem
training result:
![image](https://hackmd.io/_uploads/Bk7XOLUc6.png)
![image](https://hackmd.io/_uploads/rJVD_8I5T.png)
![image](https://hackmd.io/_uploads/Bywp_LLq6.png)
![image](https://hackmd.io/_uploads/SJlXFILc6.png)
> **ALTHEA**
Slides for the meeting with Kuroshio: https://www.canva.com/design/DAF61DolWRI/TH34scuU1JJlc97O4iEciw/edit?utm_content=DAF61DolWRI&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton
Some sites on YOLO and JavaScript
v1~3: https://github.com/CristianAbrante/YOLO-in-browser?tab=readme-ov-file
new(web cam): https://www.youtube.com/watch?v=YSG_541LFng&ab_channel=MuhammadMoin
v7: https://towardsdatascience.com/training-a-custom-yolov7-in-pytorch-and-running-it-directly-in-the-browser-with-tensorflow-js-96a5ecd7a530
v8: https://github.com/Hyuto/yolov8-tfjs
Official site: https://www.tensorflow.org/js/tutorials?hl=zh-tw
NSTC proposal: https://docs.google.com/document/d/1NwtKKLs306YGXuU1aP3JI0T7gdJ11Ndu1XvjeGWMVKM/edit?usp=sharing
> **GOOD**

## 12/6~12/12
### What we studied
>**YUKI**

> **CURRY**

> **ALTHEA**

A paper I read that could be cited in the proposal:
It uses deep learning to build software that identifies Risso's dolphins, also by similarity matching, but the technical side is described very briefly, so there is little to draw on
[Paper](https://link.springer.com/chapter/10.1007/978-3-030-82024-4_12)
> **GOOD**

## 3/12
### What we studied
>**YUKI**

Datasets
```python=
# Simple synthetic dataset
from sklearn.datasets import make_classification

X, y = make_classification(n_features=2, n_redundant=0, n_classes=2,
                           n_clusters_per_class=1, random_state=1,
                           n_informative=2, n_samples=100)
```
![image](https://hackmd.io/_uploads/HyrTaBa6T.png)
```python=
# Vary 'perplexity' (10, 20, 30, 40)
from sklearn import manifold

X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=10).fit_transform(X)
```
![image](https://hackmd.io/_uploads/ryaFAB6Tp.png)
![image](https://hackmd.io/_uploads/SkW_Rr6pT.png)
![image](https://hackmd.io/_uploads/SktG1UTaT.png)
![image](https://hackmd.io/_uploads/ryfV1Lp6T.png)
```python=
# Vary 'init' (random, pca) at perplexity = 10, 30, 40
X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=30).fit_transform(X)
```
![image](https://hackmd.io/_uploads/HkK4lIaTa.png)
![image](https://hackmd.io/_uploads/S1x7gLppp.png)
![image](https://hackmd.io/_uploads/ByLPeI6aa.png)
```python=
# Handwritten digits dataset, shape (1083, 64)
from sklearn.datasets import load_digits

digits = load_digits(n_class=6)
X, y = digits.data, digits.target
# Each call is a separate run; X_tsne is overwritten every time
X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=30).fit_transform(X)
X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=10).fit_transform(X)
X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=40).fit_transform(X)
X_tsne = manifold.TSNE(n_components=2, init='pca', random_state=5,
                       verbose=0, perplexity=30).fit_transform(X)
```
![image](https://hackmd.io/_uploads/HJRlR8TT6.png)
![image](https://hackmd.io/_uploads/B1WyJPaaT.png)
![image](https://hackmd.io/_uploads/H1izkDTpT.png)
![image](https://hackmd.io/_uploads/ByWUyD66p.png)
```python=
# Arbitrary synthetic data (results are poor)
X, y = make_classification(n_features=128, n_redundant=0, n_classes=4,
                           n_clusters_per_class=5, random_state=1,
                           n_informative=128, n_samples=100)
X_tsne = manifold.TSNE(n_components=2, init='pca', random_state=5,
                       verbose=0, perplexity=30).fit_transform(X)
X_tsne = manifold.TSNE(n_components=2, init='random', random_state=5,
                       verbose=0, perplexity=30).fit_transform(X)
```
![image](https://hackmd.io/_uploads/ryt4ev666.png)
![image](https://hackmd.io/_uploads/rk3BewTT6.png)
> **CURRY**

Processing the new data: about 1/5 done so far
![image](https://hackmd.io/_uploads/H1t_e3a66.png)
Processing the results after training:
![image](https://hackmd.io/_uploads/ryU-Z3a66.png)
![image](https://hackmd.io/_uploads/SJsrZhTpT.png)
![image](https://hackmd.io/_uploads/Hk5Ub3T6p.png)
> **ALTHEA**

The dolphins in the new dataset differ in size from those in the previous dataset, so the old model cannot detect objects in the new data well. I therefore labeled 190 images from the new data and retrained the model
![results](https://hackmd.io/_uploads/rkgLCsTT6.png)
Predictions on the new dataset
![HL20180709_01_Gg_063_WL_SAMM6724](https://hackmd.io/_uploads/HkVtRoaT6.jpg)
![HL20100730 _ 02_gg 015 IMG_8873 a](https://hackmd.io/_uploads/By8cRo6aT.jpg)
Double-checked the data organized by 吳文彰
Next: work on optimizing the app
> **GOOD**

Modified the code and added plots
Tested how the model affects accuracy under the same hyperparameters (preliminary test; the code is unfinished, so there is no data table yet)
ResNet50 (pretrained) + three fully connected layers -> top5 accuracy starts at 70 and later rises to 90
ResNet50 (not pretrained) + three fully connected layers -> top5 accuracy starts at 60 and later reaches 80
Three fully connected layers only -> top5 accuracy starts at 60 and ends at 3
## 3/19
### What we studied
>**YUKI**

> **CURRY**

![image](https://hackmd.io/_uploads/H19Dn1PA6.png)
Manual labeling is done for now; automation is in progress.
> **ALTHEA**
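![image](https://hackmd.io/_uploads/SyEno1P0T.png)
Training run with +180 images, 200 epochs
The results were not great, so I labeled another 150 random dolphin images from the web plus 200 images from the new dataset.
I don't know why, but partway through training in the middle of the night one epoch started taking 20~30 minutes... At that point the GPU showed memory in use (MiB) but no power draw???
This run stopped at epoch 140
> **GOOD**

![image](https://hackmd.io/_uploads/SkVC2kvCp.png)
1. Confirmed that the previous code had problems
2. Training with the adjusted code
## 4/16
### What we studied
>**YUKI**

![image](https://hackmd.io/_uploads/SkR9F0ieR.png)
![image](https://hackmd.io/_uploads/ryeYjYRoxC.png)
![image](https://hackmd.io/_uploads/B1r2FAilR.png)
> **CURRY**

Finished preparing the new dataset (manual + automatic)
Still to do: cropping, background removal, and splitting left and right sides
It can be added to the dataset before the next meeting
Working on the dataloader and the accuracy part for the new model
> **ALTHEA**

![image](https://hackmd.io/_uploads/rJw5h0igA.png)
> **GOOD**

![image](https://hackmd.io/_uploads/BJc29AogA.png)
![image](https://hackmd.io/_uploads/B1O6c0oxC.png)
## 4/23
### What we studied
>**YUKI**

![image](https://hackmd.io/_uploads/SJlzyfBW0.png)
Processed the dataset by hand
> **CURRY**
> ![image](https://hackmd.io/_uploads/BkGVyzSWC.png)

![image](https://hackmd.io/_uploads/r15SJMSWC.png)
![image](https://hackmd.io/_uploads/Bk_I1zHb0.png)
acc_with_threshold = 88%
top1 = 90~92%
top2 = 94~96%
top3 = 94~96%
> **ALTHEA**

![Ocean Day](https://hackmd.io/_uploads/ByqdHuEbC.jpg)
Finished the Word document up to the related-techniques section (written earlier for the proposal)
Figured out roughly how to do the slides
> **GOOD**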
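
---
The top1 / top2 / top3 figures in the 4/23 notes come from a query-versus-gallery check. The snippet below is only a minimal sketch of that kind of evaluation, not the project's actual code: the function name `top_k_accuracy`, the use of Euclidean distance, and the toy 2-D embeddings are all assumptions made for illustration. `acc_with_threshold` is not covered, because the notes do not say how it is defined.

```python
import math


def top_k_accuracy(queries, gallery, ks=(1, 2, 3)):
    """Share of queries whose identity shows up among the k nearest gallery items.

    queries / gallery: lists of (embedding, identity) pairs (hypothetical format).
    """
    hits = {k: 0 for k in ks}
    for q_emb, q_id in queries:
        # Rank gallery items from nearest to farthest embedding.
        ranked = sorted(gallery, key=lambda item: math.dist(q_emb, item[0]))
        ranked_ids = [identity for _, identity in ranked]
        for k in ks:
            if q_id in ranked_ids[:k]:
                hits[k] += 1
    return {k: hits[k] / len(queries) for k in ks}


# Toy data: three gallery individuals and three query photos (made-up values).
toy_gallery = [((0.0, 0.0), "A"), ((1.0, 0.0), "B"), ((5.0, 5.0), "C")]
toy_queries = [((0.1, 0.0), "A"), ((0.6, 0.0), "A"), ((4.0, 4.0), "B")]
print(top_k_accuracy(toy_queries, toy_gallery))
```

In a real run the embeddings would come from the trained encoder, and the gallery would hold every non-query photo of the validation fold.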