# Anata Male Optimization

## Summary
To summarize, here are the steps I ultimately took to create a safe environment for batch optimizing assets:
- Get stats on ALL glb files found in `files/male`
  - `./scripts/get_stats.sh files/male KB`
  - 977 rows found
- Generate `male.html`, `male_viz.txt`, `male_viz.json`, `male_viz.csv`
  - `python3 scripts/visualize_stats.py metadata/stats/male.csv -o all`
- Copy male assets + folder structure to `files/optimize/files/`
  - `bash scripts/optimize/to_optimize2.sh metadata/stats/male_viz.csv`
- Create the markdown table with detailed stats about each category
  - `python3 category_stats.py male_viz.json male_trait_stats.csv`
  - Shows the sum and average for draw calls / triangles / file sizes
  - Planning to add a second JSON argument for before / after columns, i.e. `python3 category_stats.py male_viz.json male_opti.json male_trait_stats.csv`
  - `metadata/stats/male_viz.json` + `files/optimize/male_viz.json`
- Separate textures from the GLBs; saves as glTF + `.bin` + textures in the same dir
  - `find ./files -iname "*.glb" -exec sh -c 'gltf-pipeline -i "$1" -s -o "$(dirname "$1")/$(basename "$1" .glb).gltf"' _ {} \;`
- Get details about the image textures belonging to all GLBs
  - `{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.jpg" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_jpg.csv`
  - `{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.png" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_png.csv`
- **Backup** all images recursively while keeping the folder structure
  - `find files/ -type f \( -iname "*.png" -o -iname "*.jpg" \) -exec tar -rvf images.tar {} +`
- Resize textures to the nearest power-of-two resolution
  - Preview: `python3 resize_textures.py path/to/your/file.csv -n`
  - Always good to verify before running the command; without `-n` it will overwrite files
  - Resize: `python3 resize_textures.py path/to/your/file.csv`
- Compress PNG textures at 80% quality
  - `find files/ -type f -name "*.png" -print0 | while IFS= read -r -d '' file; do pngquant --skip-if-larger --quality 80 --output "$file" --force "$file"; done`
- Generate another CSV with detailed stats after the command finishes to compare results
  - `{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.png" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_png2.csv`
- Pretty print the results
  - `python3 csv_analyze.py male_png2.csv > male_png2.txt`
- Compress JPEG textures at 80% quality
  - `find files/ -type f -name "*.jpg" -print0 | while IFS= read -r -d '' file; do jpegoptim --max=80 --force "$file"; done`
- Generate a CSV with detailed stats again
  - `{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.jpg" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_jpg2.csv`
- Pretty print the results
  - `python3 csv_analyze.py male_jpg2.csv > male_jpg2.txt`
## scripts/optimize
Fixed `scripts/optimize/to_optimize.sh`; it now checks for the `File Path` column from the CSV first.

Used `bash scripts/optimize/to_optimize2.sh metadata/stats/male_viz.csv` and copied all the male assets to `files/optimize/files`. These files aren't in the repo since this path is in `.gitignore`.
> Note: Since the directory structure / file paths match the main repo, we can run the same `get_stats.sh` and `visualize_stats.py` to generate comparable data.
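
For illustration, here's a Python rendition of what that copy step does (the actual script is bash; the `File Path` column is the one mentioned above, and this sketch skips the script's fallback checks):
```python!
#!/usr/bin/env python3
"""Illustration of the to_optimize2.sh copy step: mirror every GLB
listed in the stats CSV into files/optimize/ while preserving the
folder structure. Not the actual script."""
import csv
import shutil
import sys
from pathlib import Path

DEST = Path("files/optimize")

with open(sys.argv[1], newline="") as f:  # e.g. metadata/stats/male_viz.csv
    for row in csv.DictReader(f):
        src = Path(row["File Path"])      # e.g. files/male/BRACE/.../asset.glb
        dst = DEST / src                  # lands under files/optimize/files/...
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```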
## scripts/analyze

Gathered all the GLBs; here are some stats compared to last month.

### category_stats.py
`Usage: python3 category_stats.py male_viz.json male_trait_stats.csv`
`male_viz.json` contains data like the following:
```json!
[
{
"Category": "BRACE",
"Name": "Abstract_Vision_Brace_plus_Cursed_Brace",
"Size": 1058.57,
"Images": 2,
"Draw calls": 2,
"Triangles": 28,
"File Path": "files/optimize/files/male/BRACE/Abstract_Vision_Brace_plus_Cursed_Brace/Abstract_Vision_Brace_plus_Cursed_Brace.glb"
},
{
"Category": "BRACE",
"Name": "Abstract_Visions_of_Flame_Brace",
"Size": 3140.28,
"Images": 4,
"Draw calls": 4,
"Triangles": 10237,
"File Path": "files/optimize/files/male/BRACE/Abstract_Visions_of_Flame_Brace/Abstract_Visions_of_Flame_Brace.glb"
},
...
```
`male_trait_stats.csv` contains data like the following:
```csvpreview
Category,Unique,Total,Percent
Body,1,1000,100
Brace,114,374,37
Clips and Kanzashi,8,19,2
Clothing,303,1000,100
Earring,61,197,20
Eyes,15,25,3
Face Other,47,140,14
Glasses,64,219,22
Hair,121,1000,100
Hair Accessory Other,2,4,0
Halos,28,129,13
Hats,11,17,2
Head,89,449,45
Head Accessory Other,42,66,7
Masks,15,47,5
Neck,61,208,21
Ribbons and Bows,1,1,0
Set,97,173,17
Sigil,95,435,44
Special Other,86,228,23
Tail,19,82,8
Tattoo,108,233,23
Type,35,1000,100
Weapon,100,516,52
Weapon Brace,30,104,10
Wings,47,126,13
```
`category_stats.py` combines these two and outputs HackMD-style markdown for presentation.
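
A minimal sketch of that combination step, for reference (not the actual script; the case-insensitive category match and the exact markdown layout are assumptions, since the JSON uses "BRACE" while the CSV uses "Brace"):
```python!
#!/usr/bin/env python3
"""Sketch of the category_stats.py idea: aggregate per-category stats
from the viz JSON and join them with the trait-count CSV as a
markdown table. Column names are taken from the snippets above."""
import csv
import json
import sys
from collections import defaultdict

stats_json, traits_csv = sys.argv[1], sys.argv[2]

agg = defaultdict(lambda: {"size": 0.0, "draws": 0, "tris": 0, "n": 0})
with open(stats_json) as f:
    for row in json.load(f):
        a = agg[row["Category"].casefold()]  # JSON categories are uppercase
        a["size"] += row["Size"]
        a["draws"] += row["Draw calls"]
        a["tris"] += row["Triangles"]
        a["n"] += 1

print("| Category | Unique | Total | Sum (KB) | Avg (KB) | Draw calls | Triangles |")
print("| --- | --- | --- | --- | --- | --- | --- |")
with open(traits_csv, newline="") as f:
    for row in csv.DictReader(f):
        a = agg.get(row["Category"].casefold())
        if a is None:  # trait category with no matching GLBs
            continue
        print(f"| {row['Category']} | {row['Unique']} | {row['Total']} | "
              f"{a['size']:.2f} | {a['size'] / a['n']:.2f} | "
              f"{a['draws']} | {a['tris']} |")
```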

The JSON will be regenerated during later optimization steps, giving us fresh values per category. We could then read from two JSON files to compare, one in `metadata/stats` and the other from `files/optimize`, as **Before** and **After** columns.
That needs a second JSON data source as an argument, like so: `python3 category_stats.py male_viz.json male_opti.json male_trait_stats.csv`

We now have a script that uses the first JSON as "Before" and the second JSON as "After", holding data from after the optimization scripts run. Not tested yet, though!
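
A sketch of what that untested before/after mode presumably does (keyed on `Category` and summing `Size`; the schema is assumed to match the snippet above, and the real script's output will differ):
```python!
#!/usr/bin/env python3
"""Sketch of the planned before/after comparison: sum Size per
category in each JSON and print paired columns with the change."""
import json
import sys
from collections import defaultdict

def size_by_category(path):
    totals = defaultdict(float)
    with open(path) as f:
        for row in json.load(f):
            totals[row["Category"]] += row["Size"]
    return totals

before = size_by_category(sys.argv[1])  # e.g. metadata/stats/male_viz.json
after = size_by_category(sys.argv[2])   # e.g. files/optimize/male_viz.json

print("| Category | Before (KB) | After (KB) | Change |")
print("| --- | --- | --- | --- |")
for cat in sorted(before):
    b, a = before[cat], after.get(cat, 0.0)
    pct = (a - b) / b * 100 if b else 0.0
    print(f"| {cat} | {b:.2f} | {a:.2f} | {pct:+.1f}% |")
```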
## Resize Textures
Notes here: https://hackmd.io/@XR/anata/%2FopsivFelTbCrkOthw-DUsQ#Textures
1. Separate textures from the GLBs
2. Create a CSV with info about the PNGs
3. Create a CSV with info about the JPGs

We now have a CSV file that looks like the snippet below:
```csvpreview
path,resolution,kilobytes,bytes
./files/male/TAIL/Siyokoy/tail_siyokoy.png,1024x1024,122.25,125187
./files/male/TAIL/Gray_Cat_Tail_Dual/Gray_Cat_Tail_Gray_Cat_Tail_BaseColor.1001.png,1024x1024,115.86,118642
./files/male/TAIL/Pink_Cat_Tail/tail_4_tail_4_BaseColor.1001.png,1024x1024,85.07,87115
./files/male/TAIL/Green_Tiger_Tail/green_Tiger_Tail.png,2048x2048,156.25,160004
./files/male/TAIL/Mermaid_Tail/blue_tail_2.png,1024x1024,688.86,705390
```
### Resize
> Prompt: I want to convert the values in the resolution column to the nearest power of two using imagemagick. If it's 1000x1024 I want it to be 1024x1024. It's fine to overwrite the original PNG or JPG files. Please use double quotes because the PNGs may or may not contain spaces and/or special characters
**Sanitized Texture Data**
Many of the filenames have special characters or spaces, which was tripping up the other scripts. Here is an improved one-liner to generate the CSV that adds quotes around the file path:
`{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.jpg" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_jpg.csv`
`{ echo "path,resolution,kilobytes,bytes"; find ./files -type f -iname "*.png" -exec sh -c 'identify -format "\"%d/%f\",\"%wx%h\",\"%b\"\n" "$0" | awk -F, "{gsub(/[^0-9]/, \"\", \$3); printf \"%s,%s,%.2f,%.0f\\n\", \$1, \$2, \$3/1024, \$3}"' {} \;; } > male_png.csv`
Can now get the basename for files despite how messy they are:
```bash!
filepath="./files/male/CLOTHING/Plug_Suit_Battle_Cloaked/man2 _Plug_suit_2_BaseColor.1001.png.004.png"
basename=$(basename "$filepath")
echo "$basename"
```
Should yield the output: `man2 _Plug_suit_2_BaseColor.1001.png.004.png`
I want to keep these filenames consistent for now since those are the paths the glTF files reference in their JSON... but we can of course edit the glTF files now as well.

Tested recombining back into a glb without changing anything (e.g. something like `gltf-pipeline -i model.gltf -o model.glb`) to confirm it works without errors. So yes, even with all the spaces it's fine, since the paths are double-quoted.

### Backup
Just in case, before working on anything, here's a one-liner to back up all images recursively while keeping the folder structure: `find files/ -type f \( -iname "*.png" -o -iname "*.jpg" \) -exec tar -rvf images.tar {} +`. Since the archive stores relative paths, `tar -xvf images.tar` later unpacks everything back into place.

Usage for `resize_textures.py`:
```bash!
## Usage
## Preview: python3 resize_textures.py path/to/your/file.csv -n
## Convert: python3 resize_textures.py path/to/your/file.csv
```
### Previews
You want to be careful since this is destructive: it will overwrite the existing files. The `-n` flag lets us preview operations before executing.
> Should 3055x3013 upscale to 4096x4096?

Decision: round down to the lower power of two when a dimension is >2048 and <4096 (so 3055x3013 becomes 2048x2048), keeping the aspect ratio as close as possible.
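A minimal sketch of that rounding rule (the 2048 cap is an assumption based on the question above; the filename is hypothetical, and `mogrify`'s `!` flag forces the exact target geometry):
```python!
import subprocess

def nearest_pow2(n: int) -> int:
    """Round n to the nearest power of two, except at 2048 and above
    we always round down so large textures never get upscaled
    (assumption based on the 3055x3013 question above)."""
    lower = 1 << (n.bit_length() - 1)  # largest power of two <= n
    if lower >= 2048:
        return lower
    upper = lower * 2
    return lower if (n - lower) <= (upper - n) else upper

assert nearest_pow2(1000) == 1024
assert nearest_pow2(3055) == 2048

# ImageMagick resize; "!" ignores aspect ratio to hit the exact size.
w, h = 3055, 3013
subprocess.run(
    ["mogrify", "-resize", f"{nearest_pow2(w)}x{nearest_pow2(h)}!", "texture.png"],
    check=True,
)
```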

Sanitize texture filenames?: https://github.com/M3-org/anata/issues/77
Only 2 textures were not processed due to special characters; those need to be renamed in the blend file. Everything else resized great!
**Before** (Green = Power of Two resolution)

**After**

Onto the next step!
---
## Texture Compression
We now move on to compressing the textures we just resized. For this we'll use `pngquant` for PNG and `jpegoptim` for JPEG, as these produced the best results in my earlier testing.

The first pass will optimize the textures as-is. If some textures don't make use of alpha transparency, we can consider converting those to JPEG in a second pass; we can worry about that later.
**First pass PNG compression**
```
Before:
Sum (KB): 577028.02
Average (KB): 379.37
Median (KB): 236.46
Minimum (KB): 0.29
Maximum (KB): 7827.18

After (default settings):
Sum (KB): 492233.93
Average (KB): 323.63
Median (KB): 195.23
Minimum (KB): 0.12
Maximum (KB): 4614.26
15% improvement

After (80% quality):
Sum (KB): 448954.15
Average (KB): 295.17
Median (KB): 195.18
Minimum (KB): 0.12
Maximum (KB): 2045.04
23% improvement
```
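
Those summary lines come from `csv_analyze.py`; a minimal sketch of the computation (assuming the `kilobytes` column from the CSVs above, not the actual script):
```python!
#!/usr/bin/env python3
"""Sketch of the csv_analyze.py output: summary stats over the
kilobytes column of a texture CSV."""
import csv
import statistics
import sys

with open(sys.argv[1], newline="") as f:
    kb = [float(row["kilobytes"]) for row in csv.DictReader(f)]

print(f"Sum (KB): {sum(kb):.2f}")
print(f"Average (KB): {sum(kb) / len(kb):.2f}")
print(f"Median (KB): {statistics.median(kb):.2f}")
print(f"Minimum (KB): {min(kb):.2f}")
print(f"Maximum (KB): {max(kb):.2f}")
```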
> Many of the PNG textures (1500+ for male) don't require alpha transparency and can be converted to JPG (~550 atm), but that will require a way to inspect / mark which ones, and IMO it's not worth it right now: even before that, we'll want to bake down to single textures to reduce draw calls. Some of this can be automated through the Simplygon API, but it will also need a human in the loop to visually inspect the work, which I made tools for.
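
If/when we do that pass, one possible way to auto-flag candidates (a Pillow sketch, not part of the current scripts; a human should still eyeball the flagged files):
```python!
from pathlib import Path

from PIL import Image  # Pillow

def is_fully_opaque(path: Path) -> bool:
    """True when a PNG has no usable alpha: either no alpha channel,
    or an alpha channel where every pixel is fully opaque (255)."""
    with Image.open(path) as im:
        if im.mode in ("RGBA", "LA"):
            lo, _hi = im.getchannel("A").getextrema()
            return lo == 255
        if im.mode == "P":  # palette images mark transparency in info
            return "transparency" not in im.info
        return True

for p in Path("files").rglob("*.png"):
    if is_fully_opaque(p):
        print(p)  # candidate for JPG conversion
```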
**JPEG Compression**
```
Before:
Sum (KB): 239204.35
Average (KB): 429.45
Median (KB): 403.01
Minimum (KB): 11.19
Maximum (KB): 1802.60

After (80% quality with jpegoptim):
Sum (KB): 114924.40
Average (KB): 206.33
Median (KB): 189.64
Minimum (KB): 19.20
Maximum (KB): 910.69
```
**Total optimization stats**
```
2.8G images.tar
2.2G images_resized.tar
1.4G images_compressed_1.tar
1022M images_compressed_2.tar
801M images_compressed_jpg-png_80.tar
```
This is just the male collection, first pass. So far, so good.