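---
title: "Text-Driven Image Cropping with Deep Learning and Genetic Algorithm - Mazer"
tags: PyConTW2025, 2025-organize, 2025-共筆
---

# Text-Driven Image Cropping with Deep Learning and Genetic Algorithm - Mazer

{%hackmd L_RLmFdeSD--CldirtUhCw %}

<iframe src=https://app.sli.do/event/t99LBWnooL5db6Vyn2qwEQ height=450 width=100%></iframe>

:::success
AI translation subtitles and summaries are available for this talk. Click here to access >> [PyCon Taiwan AI Notebook](https://pycontw.connyaku.app/?room=1VMIfDPgdBdoYe4kJW7R)
:::

> Collaborative notes start below

## Text2Focus

- Multi-objective
    - The more of the important region the crop contains, the better.
    - The more important points that are left uncut, the better.
    - A penalty is added so that the crop region does not simply end up empty.

## Pareto optimal

- exhaustive
    - The initial approach generated many candidate solutions by exhaustive sliding-window enumeration, but this had performance problems.

## Genetic algo.

population -> evaluation by objective function -> elitism -> crossover -> mutation

- How crossover and mutation are performed can be designed according to domain knowledge.

### Genetic algo. - YUME

A lightweight, interruptible, multi-objective algorithm that the speaker designed for Text2Focus.
It first selects the Pareto front, then breeds the next generation from it, and repeats this in a loop.
NSGA-II is a good reference.

### Genetic algo. - experiment

The experiments were run on the Kaggle cats-vs-dogs classification dataset.

Characteristics of using a genetic algorithm:
- Approximation (may not be suitable for domains that require high precision, e.g. medical)
- Stable runtime

## Implementation details

Both exhaustive search and the genetic algorithm produce a large number of solutions (images), which leads to memory problems.
Use NumPy views instead of `.copy()`, and store each solution in a matrix as x, y, w, h.

- For image cropping, see the PyCon 2021 talk [用Python刻一個深度學習圖片重點裁切系統 (Building a deep-learning saliency-based image cropping system in Python)](https://tw.pycon.org/2021/en-us/conference/talk/209/)

The section below contains the speaker's updates or corrections to the talk/tutorial slides made after the talk.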
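
As a note-taker's illustration of the memory tip in the implementation details (this is not from the speaker's slides), here is a minimal sketch of storing candidate crops as rows of x, y, w, h and turning one into pixels only when needed, using a NumPy slice, which is a view rather than a copy. The image size, the box values, and the helper name `crop_view` are assumptions made up for the example.

```python
import numpy as np


def crop_view(image, box):
    """Return the crop for box = (x, y, w, h) as a view into image.

    Hypothetical helper: basic slicing does not copy pixel data.
    """
    x, y, w, h = (int(v) for v in box)
    return image[y:y + h, x:x + w]


# Placeholder 1080x1920 RGB image (assumed size, all zeros).
image = np.zeros((1080, 1920, 3), dtype=np.uint8)

# Candidate solutions kept as one small integer matrix,
# one row per crop: x, y, w, h (assumed example values).
boxes = np.array(
    [
        [0, 0, 640, 480],
        [100, 50, 800, 600],
        [960, 540, 960, 540],
    ],
    dtype=np.int32,
)

crop = crop_view(image, boxes[1])

# The view shares memory with the source image; an explicit copy does not.
view_shares = np.shares_memory(crop, image)
copy_shares = np.shares_memory(crop.copy(), image)

print(crop.shape, view_shares, copy_shares, boxes.nbytes)
```

With this layout, a whole population costs only 16 bytes per candidate (four `int32` values) until a crop is actually evaluated. Because a view aliases the source pixels, writing to `crop` would also change `image`, so a copy is still needed for any crop that must be modified.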