# ML 2021/09/09

## Goal

1. Code: Implement `TaskRunner`, a tool that can divide, schedule, and run tasks on a specific resource group (e.g., GPUs).
2. Experiments: Try a smaller T and a simpler dataset on CelebA.

## Code: TaskRunner

Main Class: [task_runner.py](https://github.com/FrankCCCCC/ntk-generative/blob/sychou/sychou/label-space/task_runner.py)

Usage Example: [run.py](https://github.com/FrankCCCCC/ntk-generative/blob/sychou/sychou/label-space/run.py)

A simple example:

```python
from task_runner import TaskRunner

def test_run(epoch: int, decay: str, gpu: int, dataset_size: int):
    import os
    # Select the GPU before JAX is imported, so the setting is in place
    # when JAX initializes CUDA.
    os.environ["CUDA_VISIBLE_DEVICES"] = f'{gpu}'
    import jax.numpy as np

    print(f"Epoch: {epoch}, Decay: {decay}, Dataset Size: {dataset_size}, GPU: {gpu}")

if __name__ == '__main__':
    config = {
        'section-1': {  # Sections are executed sequentially.
            'group-1': {  # Groups under the same section are executed concurrently.
                'Call': test_run,  # 'Call' can be either a function or a command string.
                'Param': {  # TaskRunner enumerates every combination of these parameters and runs each one once.
                    'decay': ['exp', 'anneal', 'static'],
                    'epoch': [100, 1000, 10000],
                    'dataset_size': [1000, 2000, 3000]
                },
                'Async': {  # Tasks in the same group are scheduled onto these resources by TaskRunner at runtime.
                    'machine': [0, 1],
                    'gpu': [0, 1]
                }
            },
            'group-2': {  # 'group-2' can be seen as another resource group that handles different tasks from 'group-1' during 'section-1'.
                'Call': 'ls',
                'Param': {
                    '': ['-l', '-a', '-la']
                },
                'Async': {
                    '': []
                }
            }
        },
        'section-2': {
            'group-1': {
                'Call': 'ls',
                'Param': {
                    '': ['-a']
                }
            }
        }
    }

    tr = TaskRunner(config=config)
    tr.run()
```

## Experiments: Results

We try to enhance the creativity of the generator by tuning the `train_t_rate`.
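
To make the sweep concrete, the sketch below computes the effective T for each `train_t_rate`, assuming T is obtained exactly as stated in the common settings (65536.0 * `train_t_rate`). The names `BASE_T` and `effective_t` are illustrative and are not taken from the project code.

```python
# Illustrative only: BASE_T and effective_t are hypothetical names,
# assuming T = 65536.0 * train_t_rate as listed in the common settings.
BASE_T = 65536.0

def effective_t(train_t_rate: float) -> float:
    """Return the training time T for a given train_t_rate."""
    return BASE_T * train_t_rate

# Rates used in the two experiments of this note.
exp1_rates = [0.0001, 0.001, 0.01, 0.1, 0.5, 1, 3, 6]
exp2_rates = [0.0001, 0.001, 0.01, 0.1, 1, 4, 8, 16]

for name, rates in (("Exp1", exp1_rates), ("Exp2", exp2_rates)):
    for rate in rates:
        print(f"{name}: train_t_rate={rate:<7} T={effective_t(rate):.4f}")
```

Under this assumption the sweep spans roughly five orders of magnitude in T, from about 6.55 up to 1048576.
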

Common Settings

- Dataset: Original CelebA
- Noise Size: 10
- Epoch: 10000
- T: 65536.0 * `train_t_rate`
- Perturbation Method: None
- Diag Reg: 1e-5
- target_distribution: single
- Loss Type:

$$
\min_{M} \ \text{CrossEntropyLoss}\left(1, \ \left(I - e^{\eta t K_{M+N, M+N}}\right) y_{M+N}\right)
$$

## Exp1. Batch Size 256, Dataset Size 256

- Batch Size: 256
- Dataset Size: 256
- `train_t_rate`: 0.0001, 0.001, 0.01, 0.1, 0.5, 1, 3, 6

### Train_t_rate = 6

![](https://i.imgur.com/ATYZsqM.png)
![](https://i.imgur.com/i1xnjGE.png)
![](https://i.imgur.com/uTZP5W8.png)

### Train_t_rate = 3

![](https://i.imgur.com/Je3Keuz.png)
![](https://i.imgur.com/oEGM3Bj.png)
![](https://i.imgur.com/iF9pwyE.png)

### Train_t_rate = 1

![](https://i.imgur.com/V4N8kzZ.png)
![](https://i.imgur.com/UeulDdK.png)
![](https://i.imgur.com/M9QCLAx.png)

### Train_t_rate = 0.5

![](https://i.imgur.com/MexierW.png)
![](https://i.imgur.com/2GIYR1Y.png)
![](https://i.imgur.com/TnAHAjz.png)

### Train_t_rate = 0.1

![](https://i.imgur.com/GclDidp.png)
![](https://i.imgur.com/KZst907.png)
![](https://i.imgur.com/S6RLPC4.png)

### Train_t_rate = 0.01

![](https://i.imgur.com/0a78zOw.png)
![](https://i.imgur.com/IOjg8tO.png)
![](https://i.imgur.com/BA1Nc7N.png)

### Train_t_rate = 0.001

![](https://i.imgur.com/xnh0fsK.png)
![](https://i.imgur.com/Rj6xHrh.png)
![](https://i.imgur.com/0KqwkR0.png)

### Train_t_rate = 0.0001

![](https://i.imgur.com/8HJ4ARG.png)
![](https://i.imgur.com/dIr2qT4.png)
![](https://i.imgur.com/p77ccU0.png)

## Exp2. Batch Size 512, Dataset Size 512

- Batch Size: 512
- Dataset Size: 512
- `train_t_rate`: 0.0001, 0.001, 0.01, 0.1, 1, 4, 8, 16

### Train_t_rate = 16

![](https://i.imgur.com/GSjnlFl.png)
![](https://i.imgur.com/b7BKW0T.png)
![](https://i.imgur.com/8WmGLuu.png)

### Train_t_rate = 8

![](https://i.imgur.com/1UcqnPa.png)
![](https://i.imgur.com/otBrE2W.png)
![](https://i.imgur.com/EdKB5TX.png)

### Train_t_rate = 4

![](https://i.imgur.com/3a0F3U9.png)
![](https://i.imgur.com/jZegNoU.png)
![](https://i.imgur.com/eDywGZ7.png)

### Train_t_rate = 1

![](https://i.imgur.com/v8ZzVN0.png)
![](https://i.imgur.com/j5dQ9Jb.png)
![](https://i.imgur.com/WWfnyGp.png)

### Train_t_rate = 0.1

![](https://i.imgur.com/ClnpbOx.png)
![](https://i.imgur.com/JSzVs1n.png)
![](https://i.imgur.com/AEYSFUV.png)

### Train_t_rate = 0.01

![](https://i.imgur.com/SK8rbWN.png)
![](https://i.imgur.com/9zPePPc.png)
![](https://i.imgur.com/gCbU5WL.png)

### Train_t_rate = 0.001

![](https://i.imgur.com/rMX2Wvn.png)
![](https://i.imgur.com/YIgnwty.png)
![](https://i.imgur.com/0jKguTy.png)

### Train_t_rate = 0.0001

![](https://i.imgur.com/ZyZBunA.png)
![](https://i.imgur.com/BjXIcKX.png)
![](https://i.imgur.com/Rq0tZDY.png)
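
## Note: Parameter Combinations in TaskRunner

Returning to the `Param` block of the TaskRunner example in the Code section: the config comment says that TaskRunner lists every combination of the parameters and runs each once. A minimal sketch of what that expansion could look like is given below, assuming "every combination" means the Cartesian product of the value lists. This is not the actual `TaskRunner` implementation, and the helper `expand_params` is a hypothetical name introduced only for illustration.

```python
from itertools import product

def expand_params(param: dict) -> list:
    """Hypothetical helper: return one kwargs dict per combination of values."""
    keys = list(param)
    value_lists = [param[k] for k in keys]
    # product(...) yields one tuple per element of the Cartesian product.
    return [dict(zip(keys, values)) for values in product(*value_lists)]

# The 'Param' block of 'group-1' from the example config.
param = {
    'decay': ['exp', 'anneal', 'static'],
    'epoch': [100, 1000, 10000],
    'dataset_size': [1000, 2000, 3000],
}

combos = expand_params(param)
print(len(combos))  # 3 * 3 * 3 = 27 tasks
print(combos[0])    # {'decay': 'exp', 'epoch': 100, 'dataset_size': 1000}
```

Under this reading, `group-1` would produce 27 tasks, which TaskRunner then distributes over the resources listed under `Async`.
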