# mk-options-for-auto-detection-of-cards
Below are four practical approaches, ordered from “almost-no-code” to “roll-your-own,” that people use to pull representative frames from a single-take overhead video when the object in shot stays still for a moment and then changes.
---
### 1. FFmpeg **`freezedetect` → snapshot**
`freezedetect` flags every span of frames whose pixel-to-pixel change falls below a threshold for at least *d* seconds.
Workflow:
```bash
# 1) Detect freezes and log their start times to stderr
ffmpeg -hide_banner -i cards.mp4 \
-vf "freezedetect=n=0.0005:d=0.4,metadata=print" \
-an -f null - 2> freeze.log
```
* `n` – noise tolerance: how still a frame must be to count as frozen (0 = identical; try 0.0005–0.003).
* `d` – minimum duration of the stillness; 0.4–0.6 s is usually enough for the hand to leave the frame.
`freeze.log` will contain lines like
```
[freezedetect @ ...] lavfi.freezedetect.freeze_start t:12.480 s:299
```
Parse those `t:` timestamps (bash, Python, etc.; a parsing sketch follows at the end of this section) and grab one frame a few hundred milliseconds *after* the freeze starts so you skip any residual finger blur:
```bash
# example: t=12.480 → t+0.25
ffmpeg -y -ss 12.73 -i cards.mp4 -frames:v 1 -q:v 2 still_001.jpg
```
Because the filter is built into FFmpeg you get a purely command-line solution with no heavy dependencies. ([Super User][1])
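If you'd rather automate the parse-and-grab loop, here is a minimal Python sketch. The regex assumes the log format shown above, so adjust it if your FFmpeg build prints freeze starts differently; `cards.mp4` and the `still_*.jpg` names are just this note's examples.

```python
import re
import subprocess

# Pull every freeze_start time out of the log produced above. The regex
# captures the first number after "freeze_start"; tweak it if your FFmpeg
# build logs a different format.
times = []
with open("freeze.log") as log:
    for line in log:
        m = re.search(r"freeze_start\D*?([\d.]+)", line)
        if m:
            times.append(float(m.group(1)))

# Grab one frame 250 ms after each freeze starts, skipping finger blur.
for i, t in enumerate(times):
    subprocess.run(
        ["ffmpeg", "-y", "-ss", f"{t + 0.25:.3f}", "-i", "cards.mp4",
         "-frames:v", "1", "-q:v", "2", f"still_{i:03d}.jpg"],
        check=True,
    )
```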
---
### 2. FFmpeg **`mpdecimate` / `scene`** (single pass)
If your lighting is steady, you can instead drop near-duplicates and keep one frame each time the visual histogram jumps:
```bash
ffmpeg -i cards.mp4 \
-vf "mpdecimate,setpts=N/FRAME_RATE/TB,select='gt(scene,0.1)'" \
-vsync vfr -q:v 2 card_%03d.jpg
```
* `mpdecimate` removes runs of almost-identical frames.
* `select='gt(scene,0.1)'` keeps frames whose content differs from the one that survived just before.
Tune the `scene` threshold up or down so the *hand-in-motion* frames are skipped but the *new-card-settled* frame remains. ([FFmpeg][2])
Pro: one FFmpeg command. Con: you may need to experiment with thresholds for every lighting setup; the tuning sketch below can help.
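To pick a threshold empirically, you can ask FFmpeg to print every frame's scene score and eyeball where hand motion ends and card changes begin. A sketch via Python's `subprocess` (a plain shell invocation works just as well):

```python
import subprocess

# select='gte(scene,0)' passes every frame through while forcing the scene
# score to be computed; metadata=print then logs lavfi.scene_score per frame.
subprocess.run([
    "ffmpeg", "-hide_banner", "-i", "cards.mp4",
    "-vf", "select='gte(scene,0)',metadata=print",
    "-an", "-f", "null", "-",
], check=True)
```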
---
### 3. **PySceneDetect** (Python CLI, no coding)
`pip install scenedetect`, then:
```bash
scenedetect -i cards.mp4 detect-adaptive -t 10.0 list-scenes \
split-video save-images
```
* `detect-adaptive` compares rolling frame-to-frame differences (works better than a fixed threshold when your hand causes big swings).
* `save-images` writes one JPEG per detected scene; grab the first or last image of each scene as your “card.”
Because it is a single command-line tool you avoid writing OpenCV, but you keep full control over thresholds (`-t`) and min-scene-length. ([scenedetect.com][3])
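If you later want the timestamps in-process instead of as files on disk, the same detector is exposed as a Python API; a minimal sketch using the PySceneDetect 0.6 `detect()` helper (the filename is this note's example):

```python
from scenedetect import detect, AdaptiveDetector

# Each scene is a (start, end) pair of FrameTimecode objects.
scenes = detect("cards.mp4", AdaptiveDetector())
for i, (start, end) in enumerate(scenes):
    print(f"card {i}: {start.get_seconds():.2f}s - {end.get_seconds():.2f}s")
```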
---
### 4. **20-line OpenCV script** (when you really need custom logic)
If the above still captures occasional hand frames, OpenCV lets you add extra logic—e.g. “ignore frames that contain skin-coloured blobs” or “wait until the frame-difference stays below threshold *and* no hand detector fires for N frames.” A minimal runnable version:
```python
import cv2

cap = cv2.VideoCapture("cards.mp4")
prev = None       # previous grayscale frame
still = 0         # consecutive "quiet" frames seen so far
count = 0         # stills written out
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # mean absolute pixel difference vs. the previous frame
    motion = cv2.absdiff(gray, prev).mean() if prev is not None else 255
    if motion < 2:              # <-- tuned threshold
        still += 1
        if still == 10:         # 10 consecutive quiet frames ≈ 0.33 s @ 30 fps
            cv2.imwrite(f"card_{count:03d}.jpg", frame)
            count += 1
    else:
        still = 0
    prev = gray
cap.release()
```
OpenCV’s background-subtractor (`cv2.createBackgroundSubtractorMOG2()`), MediaPipe Hands, or a YOLO tiny hand model can replace the simple mean-diff line if you need higher precision.
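As an illustration, a sketch of the background-subtractor variant; the `history` value and the motion cut-off are illustrative guesses, not tuned numbers:

```python
import cv2

cap = cv2.VideoCapture("cards.mp4")
# MOG2 learns the static tabletop; moving hands show up in the foreground mask.
bg = cv2.createBackgroundSubtractorMOG2(history=300, detectShadows=False)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    fg = bg.apply(frame)                # non-zero where pixels are "moving"
    moving_fraction = (fg > 0).mean()   # share of pixels flagged as motion
    # Swap this in for the mean-diff test in the script above, e.g.:
    # if moving_fraction < 0.005: still += 1 ... (same counter logic)
cap.release()
```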
---
## Choosing an approach
| Need | Best Option |
| -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
| “I want zero Python.” | **1** or **2** (pure FFmpeg) |
| Cross-platform GUI, threshold tweaking | **3** (PySceneDetect) |
| Reject every frame containing a hand, logo, glare, etc. | **4** (OpenCV) |
| Want the result already cut into a timeline for grading. | Run `freezedetect` in FFmpeg, then import the timestamps as markers in Resolve/FCPX, or use Resolve/Fusion scripting. |
All four methods can be chained into your Airtable/automated-logging workflow: once you know the timestamp for each captured still, you can write a small script to POST a row to Airtable with the card’s filename, the source video’s record ID, and any project metadata you need.
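For the Airtable step, a hypothetical sketch with `requests`; `BASE_ID`, the `Cards` table, and both field names are placeholders for your own schema, and the token is an Airtable personal access token:

```python
import requests

def log_card(filename: str, video_record_id: str, token: str) -> None:
    # BASE_ID and "Cards" are placeholders; use your own base ID and table.
    url = "https://api.airtable.com/v0/BASE_ID/Cards"
    payload = {"fields": {
        "Still filename": filename,      # placeholder field name
        "Source video": video_record_id, # placeholder field name
    }}
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
```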
That should give you a spectrum of effort vs. control—pick the one that matches how much time you want to spend automating this step. Happy card-capturing!
[1]: https://superuser.com/questions/1313070/way-to-detect-frozen-video-with-ffmpeg "Way to detect frozen video with ffmpeg? - Super User"
[2]: https://ffmpeg.org/ffmpeg-filters.html "FFmpeg Filters Documentation"
[3]: https://www.scenedetect.com/docs/latest/api/detectors.html "Detection Algorithms — PySceneDetect 0.6.6 documentation"