---
title: "Text-Driven Image Cropping with Deep Learning and Genetic Algorithm - Mazer"
tags: PyConTW2025, 2025-organize, 2025-共筆
---
# Text-Driven Image Cropping with Deep Learning and Genetic Algorithm - Mazer
{%hackmd L_RLmFdeSD--CldirtUhCw %}
<iframe src=https://app.sli.do/event/t99LBWnooL5db6Vyn2qwEQ height=450 width=100%></iframe>
:::success
AI translation subtitles and summaries are available for this talk. Click here to access >> [PyCon Taiwan AI Notebook](https://pycontw.connyaku.app/?room=1VMIfDPgdBdoYe4kJW7R)
:::
> Collaborative notes start below
## Text2Focus - Multiobjective
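- The more key regions the crop contains, the better
- The more key regions that are not cut off, the better
- A penalty is added so that the crop region does not simply end up empty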
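
To make the objectives concrete, here is a minimal sketch, not the speaker's implementation. It assumes key regions arrive as `(x, y, w, h)` boxes and that "not cut off" means the crop border does not pass through a box; `crop_objectives` and `empty_penalty` are hypothetical names. Under that reading an empty crop cuts nothing and would score perfectly on the second objective, which is one plausible reason a penalty is needed.

```python
import numpy as np

def crop_objectives(boxes, crop, empty_penalty=1.0):
    """Score a crop (x, y, w, h) against a non-empty list of key-region boxes.

    Returns (covered, uncut), both to be maximized:
      covered: fraction of key regions that overlap the crop
      uncut:   fraction of key regions the crop border does not slice through
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    cx, cy, cw, ch = crop
    if cw <= 0 or ch <= 0:
        # Assumed penalty: an empty crop must never look attractive.
        return (-empty_penalty, -empty_penalty)
    bx, by, bw, bh = boxes.T
    # Width and height of the intersection between each box and the crop.
    iw = np.minimum(bx + bw, cx + cw) - np.maximum(bx, cx)
    ih = np.minimum(by + bh, cy + ch) - np.maximum(by, cy)
    overlaps = (iw > 0) & (ih > 0)
    inside = (bx >= cx) & (by >= cy) & (bx + bw <= cx + cw) & (by + bh <= cy + ch)
    cut = overlaps & ~inside  # partially inside: sliced by the crop border
    return (float(overlaps.mean()), float(1.0 - cut.mean()))
```

For two boxes `[[10, 10, 20, 20], [50, 50, 20, 20]]`, the crop `(0, 0, 40, 40)` covers one box without cutting either, while `(0, 0, 60, 60)` touches both but slices the second, so neither crop dominates the other.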
## Pareto optimal - exhaustive
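The first approach was to generate many candidate solutions
by exhaustively enumerating crops with a sliding window,
but this ran into performance problems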
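
A rough sketch of the exhaustive approach, assuming fixed-size windows and objectives that are all maximized. `sliding_windows` and `pareto_front` are illustrative names, not code from the talk. Every window position has to be scored, and real use would also vary the window size, which is where the cost comes from.

```python
import numpy as np

def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield every (x, y, w, h) window of one fixed size that fits in the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)

def pareto_front(scores):
    """Return indices of non-dominated rows; every column is maximized."""
    scores = np.asarray(scores, dtype=float)
    keep = []
    for i, s in enumerate(scores):
        # s is dominated if some row is >= s everywhere and > s somewhere.
        at_least_as_good = np.all(scores >= s, axis=1)
        strictly_better = np.any(scores > s, axis=1)
        if not np.any(at_least_as_good & strictly_better):
            keep.append(i)
    return keep
```

The pairwise dominance check is quadratic in the number of candidates, so the cost grows quickly as the stride shrinks or more window sizes are tried.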
## Genetic algo.
population -> evaluation by objective function -> elitism -> crossover -> mutation
- How crossover and mutation are performed can be set according to domain knowledge
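
As an illustration of domain-specific operators, the sketch below assumes each individual is a crop box `(x, y, w, h)`. The operator choices (uniform crossover, single-coordinate jitter, clamping to the image) and the `step` parameter are assumptions for the example, not the operators used in the talk.

```python
import random

def crossover(a, b, rng):
    """Uniform crossover: each of x, y, w, h is taken from one of the parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))

def mutate(box, img_w, img_h, rng, step=8):
    """Jitter one coordinate, then clamp so the box stays inside the image."""
    vals = list(box)
    i = rng.randrange(4)
    vals[i] += rng.randint(-step, step)
    x, y, w, h = vals
    # Domain knowledge: a crop has positive size and lies within the image.
    w = max(1, min(w, img_w))
    h = max(1, min(h, img_h))
    x = max(0, min(x, img_w - w))
    y = max(0, min(y, img_h - h))
    return (x, y, w, h)
```

Clamping inside `mutate` means no invalid crop ever reaches the objective function, so the evaluation step needs no extra validity checks.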
### Genetic algo. - YUME
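A lightweight, interruptible, multiobjective algorithm
that the speaker designed for Text2Focus
It first selects the Pareto front, breeds the next generation from it, and repeats in a loop
NSGA-II is a good reference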
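
The loop described here (select the Pareto front, breed from it, repeat) can be sketched as follows. This is not YUME itself: the time budget as the interruption mechanism, the function names, and the `breed` callback are all assumptions for illustration, and individuals are assumed to be hashable tuples. Unlike NSGA-II, the sketch does no rank-based non-dominated sorting and no crowding-distance selection.

```python
import random
import time

def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def pareto_front(pop, evaluate):
    """Return the individuals whose scores no other individual dominates."""
    scores = [evaluate(ind) for ind in pop]
    return [ind for ind, s in zip(pop, scores)
            if not any(dominates(t, s) for t in scores)]

def evolve(init_pop, evaluate, breed, budget_s=0.05, max_gens=50, seed=0):
    """Keep the Pareto front, breed from it, repeat until time or generations run out."""
    rng = random.Random(seed)
    size = len(init_pop)
    front = pareto_front(init_pop, evaluate)
    deadline = time.monotonic() + budget_s
    for _ in range(max_gens):
        if time.monotonic() >= deadline:
            break  # interrupted: the current front is still a usable answer
        children = [breed(rng.choice(front), rng.choice(front), rng)
                    for _ in range(size)]
        # Elitism: the old front competes with its children (duplicates removed).
        front = pareto_front(list(dict.fromkeys(front + children)), evaluate)
    return front
```

Because a valid front exists after every generation, stopping early only costs solution quality, which fits the "interruptible" property mentioned above.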
### Genetic algo. - experiment
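Experiments were run on the Kaggle cats-vs-dogs classification dataset
Characteristics of using a genetic algorithm:
- Approximation (may not suit fields that need high precision, e.g. medical use)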
- Stable Runtime
## Implementation details
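Both exhaustive search and the genetic algorithm produce many solutions (images),
which leads to memory problems
The fix: use NumPy views instead of `.copy()`, and store only x, y, w, h in a matrix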
- For image cropping, see the PyCon TW 2021 talk [用Python刻一個深度學習圖片重點裁切系統](https://tw.pycon.org/2021/en-us/conference/talk/209/)
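
A small sketch of the memory point, assuming the image is a NumPy array and each candidate crop is one `x, y, w, h` row. `crop_view` is a hypothetical helper. Basic slicing returns a view that shares the image's buffer, so only the small coordinate matrix grows with the number of candidates.

```python
import numpy as np

image = np.zeros((480, 640, 3), dtype=np.uint8)

# One row per candidate crop: x, y, w, h. Only this small matrix is stored.
crops = np.array([[0, 0, 320, 240],
                  [100, 50, 200, 200]], dtype=np.int32)

def crop_view(img, box):
    """Return the crop as a view on img (basic slicing does not copy pixels)."""
    x, y, w, h = (int(v) for v in box)
    return img[y:y + h, x:x + w]

view = crop_view(image, crops[1])   # shares memory with image
copied = view.copy()                # allocates a new 200 x 200 x 3 buffer
```

Writing into a view also changes the original image, so a copy is still the right choice when a crop has to be modified independently.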
Below are the parts of the slides that the speaker updated or corrected after the talk