在 pre-trained model 內有三個物件
# Load pre-trained network.
url = 'https://drive.google.com/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ' # karras2019stylegan-ffhq-1024x1024.pkl
with dnnlib.util.open_url(url, cache_dir=config.cache_dir) as f:
_G, _D, Gs = pickle.load(f)
# _G = Instantaneous snapshot of the generator. Mainly useful for resuming a previous training run.
# _D = Instantaneous snapshot of the discriminator. Mainly useful for resuming a previous training run.
# Gs = Long-term average of the generator. Yields higher-quality results than the instantaneous snapshot.
Gs.run()
可直接將 n 個 vector 輸入後輸出 n 張圖片,透過前面的種子設定輸出圖片的樣式
# Pick latent vector.
rnd = np.random.RandomState(5)
latents = rnd.randn(1, Gs.input_shape[1]) # n is 1 here
# Generate image.
fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True, output_transform=fmt)
Gs.get_output_for()
取得 generator 做到一半的成果,可以直接接續給下一個 tensorflow 模型製作,不須再透過其他套件轉換圖片格式成 numpy array。
latents = tf.random_normal([self.minibatch_per_gpu] + Gs_clone.input_shape[1:])
images = Gs_clone.get_output_for(latents, None, is_validation=True, randomize_noise=True)
images = tflib.convert_images_to_uint8(images)
result_expr.append(inception_clone.get_output_for(images))
Gs.components.mapping()
將輸入轉到 Gs.components.synthesis()
把在
src_latents = np.stack(np.random.RandomState(seed).randn(Gs.input_shape[1]) for seed in src_seeds)
src_dlatents = Gs.components.mapping.run(src_latents, None) # [seed, layer, component]
src_images = Gs.components.synthesis.run(src_dlatents, randomize_noise=False, **synthesis_kwargs)
CelebA-HQ | FFHQ | |
---|---|---|
paper link | paper with code link | paper with code link |
dataset info | human face | human face |
image count | 30k images | 70k images |
image resolution | 1024 X 1024 | 1024 X 1024 |
to load | load with tensorflow | load from github |
Section 2.1 第三段最後,figure. 2 下面
We found these choices to give the best
results. Our contributions do not modify the loss function.
it's so boring to read this loss function 張議隆
python 3.6 (using conda)
tensorflow 2.6.2 (1.1.0 or newer)
numpy 1.19.5 (1.14.3 or newer)
DRAM 24GB (11GB or more)
NVIDIA driver version 520.56.06 (391.35 or newer)
CUDA 11.8 (8.0 or newer)
cuDNN 8.7.0.84 (7.3.1 or newer)
12/16 林凱翔張議隆
using 1024X1024 image from ffhq dataset
Gs Params OutputShape WeightShape
--- --- --- ---
latents_in - (?, 512) -
labels_in - (?, 0) -
lod - () -
dlatent_avg - (512,) -
G_mapping/latents_in - (?, 512) -
G_mapping/labels_in - (?, 0) -
G_mapping/PixelNorm - (?, 512) -
G_mapping/Dense0 262656 (?, 512) (512, 512)
G_mapping/Dense1 262656 (?, 512) (512, 512)
G_mapping/Dense2 262656 (?, 512) (512, 512)
G_mapping/Dense3 262656 (?, 512) (512, 512)
G_mapping/Dense4 262656 (?, 512) (512, 512)
G_mapping/Dense5 262656 (?, 512) (512, 512)
G_mapping/Dense6 262656 (?, 512) (512, 512)
G_mapping/Dense7 262656 (?, 512) (512, 512)
G_mapping/Broadcast - (?, 18, 512) -
G_mapping/dlatents_out - (?, 18, 512) -
Truncation - (?, 18, 512) -
G_synthesis/dlatents_in - (?, 18, 512) -
G_synthesis/4x4/Const 534528 (?, 512, 4, 4) (512,)
G_synthesis/4x4/Conv 2885632 (?, 512, 4, 4) (3, 3, 512, 512)
G_synthesis/ToRGB_lod8 1539 (?, 3, 4, 4) (1, 1, 512, 3)
G_synthesis/8x8/Conv0_up 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/8x8/Conv1 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/ToRGB_lod7 1539 (?, 3, 8, 8) (1, 1, 512, 3)
G_synthesis/Upscale2D - (?, 3, 8, 8) -
G_synthesis/Grow_lod7 - (?, 3, 8, 8) -
G_synthesis/16x16/Conv0_up 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/16x16/Conv1 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/ToRGB_lod6 1539 (?, 3, 16, 16) (1, 1, 512, 3)
G_synthesis/Upscale2D_1 - (?, 3, 16, 16) -
G_synthesis/Grow_lod6 - (?, 3, 16, 16) -
G_synthesis/32x32/Conv0_up 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/32x32/Conv1 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/ToRGB_lod5 1539 (?, 3, 32, 32) (1, 1, 512, 3)
G_synthesis/Upscale2D_2 - (?, 3, 32, 32) -
G_synthesis/Grow_lod5 - (?, 3, 32, 32) -
G_synthesis/64x64/Conv0_up 1442816 (?, 256, 64, 64) (3, 3, 512, 256)
G_synthesis/64x64/Conv1 852992 (?, 256, 64, 64) (3, 3, 256, 256)
G_synthesis/ToRGB_lod4 771 (?, 3, 64, 64) (1, 1, 256, 3)
G_synthesis/Upscale2D_3 - (?, 3, 64, 64) -
G_synthesis/Grow_lod4 - (?, 3, 64, 64) -
G_synthesis/128x128/Conv0_up 426496 (?, 128, 128, 128) (3, 3, 256, 128)
G_synthesis/128x128/Conv1 279040 (?, 128, 128, 128) (3, 3, 128, 128)
G_synthesis/ToRGB_lod3 387 (?, 3, 128, 128) (1, 1, 128, 3)
G_synthesis/Upscale2D_4 - (?, 3, 128, 128) -
G_synthesis/Grow_lod3 - (?, 3, 128, 128) -
G_synthesis/256x256/Conv0_up 139520 (?, 64, 256, 256) (3, 3, 128, 64)
G_synthesis/256x256/Conv1 102656 (?, 64, 256, 256) (3, 3, 64, 64)
G_synthesis/ToRGB_lod2 195 (?, 3, 256, 256) (1, 1, 64, 3)
G_synthesis/Upscale2D_5 - (?, 3, 256, 256) -
G_synthesis/Grow_lod2 - (?, 3, 256, 256) -
G_synthesis/512x512/Conv0_up 51328 (?, 32, 512, 512) (3, 3, 64, 32)
G_synthesis/512x512/Conv1 42112 (?, 32, 512, 512) (3, 3, 32, 32)
G_synthesis/ToRGB_lod1 99 (?, 3, 512, 512) (1, 1, 32, 3)
G_synthesis/Upscale2D_6 - (?, 3, 512, 512) -
G_synthesis/Grow_lod1 - (?, 3, 512, 512) -
G_synthesis/1024x1024/Conv0_up 21056 (?, 16, 1024, 1024) (3, 3, 32, 16)
G_synthesis/1024x1024/Conv1 18752 (?, 16, 1024, 1024) (3, 3, 16, 16)
G_synthesis/ToRGB_lod0 51 (?, 3, 1024, 1024) (1, 1, 16, 3)
G_synthesis/Upscale2D_7 - (?, 3, 1024, 1024) -
G_synthesis/Grow_lod0 - (?, 3, 1024, 1024) -
G_synthesis/images_out - (?, 3, 1024, 1024) -
G_synthesis/lod - () -
G_synthesis/noise0 - (1, 1, 4, 4) -
G_synthesis/noise1 - (1, 1, 4, 4) -
G_synthesis/noise2 - (1, 1, 8, 8) -
G_synthesis/noise3 - (1, 1, 8, 8) -
G_synthesis/noise4 - (1, 1, 16, 16) -
G_synthesis/noise5 - (1, 1, 16, 16) -
G_synthesis/noise6 - (1, 1, 32, 32) -
G_synthesis/noise7 - (1, 1, 32, 32) -
G_synthesis/noise8 - (1, 1, 64, 64) -
G_synthesis/noise9 - (1, 1, 64, 64) -
G_synthesis/noise10 - (1, 1, 128, 128) -
G_synthesis/noise11 - (1, 1, 128, 128) -
G_synthesis/noise12 - (1, 1, 256, 256) -
G_synthesis/noise13 - (1, 1, 256, 256) -
G_synthesis/noise14 - (1, 1, 512, 512) -
G_synthesis/noise15 - (1, 1, 512, 512) -
G_synthesis/noise16 - (1, 1, 1024, 1024) -
G_synthesis/noise17 - (1, 1, 1024, 1024) -
images_out - (?, 3, 1024, 1024) -
--- --- --- ---
Total 26219627
G Params OutputShape WeightShape
--- --- --- ---
latents_in - (?, 512) -
labels_in - (?, 0) -
lod - () -
dlatent_avg - (512,) -
G_mapping/latents_in - (?, 512) -
G_mapping/labels_in - (?, 0) -
G_mapping/PixelNorm - (?, 512) -
G_mapping/Dense0 262656 (?, 512) (512, 512)
G_mapping/Dense1 262656 (?, 512) (512, 512)
G_mapping/Dense2 262656 (?, 512) (512, 512)
G_mapping/Dense3 262656 (?, 512) (512, 512)
G_mapping/Dense4 262656 (?, 512) (512, 512)
G_mapping/Dense5 262656 (?, 512) (512, 512)
G_mapping/Dense6 262656 (?, 512) (512, 512)
G_mapping/Dense7 262656 (?, 512) (512, 512)
G_mapping/Broadcast - (?, 14, 512) -
G_mapping/dlatents_out - (?, 14, 512) -
Truncation - (?, 14, 512) -
G_synthesis/dlatents_in - (?, 14, 512) -
G_synthesis/4x4/Const 534528 (?, 512, 4, 4) (512,)
G_synthesis/4x4/Conv 2885632 (?, 512, 4, 4) (3, 3, 512, 512)
G_synthesis/ToRGB_lod6 1539 (?, 3, 4, 4) (1, 1, 512, 3)
G_synthesis/8x8/Conv0_up 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/8x8/Conv1 2885632 (?, 512, 8, 8) (3, 3, 512, 512)
G_synthesis/ToRGB_lod5 1539 (?, 3, 8, 8) (1, 1, 512, 3)
G_synthesis/Upscale2D - (?, 3, 8, 8) -
G_synthesis/Grow_lod5 - (?, 3, 8, 8) -
G_synthesis/16x16/Conv0_up 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/16x16/Conv1 2885632 (?, 512, 16, 16) (3, 3, 512, 512)
G_synthesis/ToRGB_lod4 1539 (?, 3, 16, 16) (1, 1, 512, 3)
G_synthesis/Upscale2D_1 - (?, 3, 16, 16) -
G_synthesis/Grow_lod4 - (?, 3, 16, 16) -
G_synthesis/32x32/Conv0_up 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/32x32/Conv1 2885632 (?, 512, 32, 32) (3, 3, 512, 512)
G_synthesis/ToRGB_lod3 1539 (?, 3, 32, 32) (1, 1, 512, 3)
G_synthesis/Upscale2D_2 - (?, 3, 32, 32) -
G_synthesis/Grow_lod3 - (?, 3, 32, 32) -
G_synthesis/64x64/Conv0_up 1442816 (?, 256, 64, 64) (3, 3, 512, 256)
G_synthesis/64x64/Conv1 852992 (?, 256, 64, 64) (3, 3, 256, 256)
G_synthesis/ToRGB_lod2 771 (?, 3, 64, 64) (1, 1, 256, 3)
G_synthesis/Upscale2D_3 - (?, 3, 64, 64) -
G_synthesis/Grow_lod2 - (?, 3, 64, 64) -
G_synthesis/128x128/Conv0_up 426496 (?, 128, 128, 128) (3, 3, 256, 128)
G_synthesis/128x128/Conv1 279040 (?, 128, 128, 128) (3, 3, 128, 128)
G_synthesis/ToRGB_lod1 387 (?, 3, 128, 128) (1, 1, 128, 3)
G_synthesis/Upscale2D_4 - (?, 3, 128, 128) -
G_synthesis/Grow_lod1 - (?, 3, 128, 128) -
G_synthesis/256x256/Conv0_up 139520 (?, 64, 256, 256) (3, 3, 128, 64)
G_synthesis/256x256/Conv1 102656 (?, 64, 256, 256) (3, 3, 64, 64)
G_synthesis/ToRGB_lod0 195 (?, 3, 256, 256) (1, 1, 64, 3)
G_synthesis/Upscale2D_5 - (?, 3, 256, 256) -
G_synthesis/Grow_lod0 - (?, 3, 256, 256) -
G_synthesis/images_out - (?, 3, 256, 256) -
G_synthesis/lod - () -
G_synthesis/noise0 - (1, 1, 4, 4) -
G_synthesis/noise1 - (1, 1, 4, 4) -
G_synthesis/noise2 - (1, 1, 8, 8) -
G_synthesis/noise3 - (1, 1, 8, 8) -
G_synthesis/noise4 - (1, 1, 16, 16) -
G_synthesis/noise5 - (1, 1, 16, 16) -
G_synthesis/noise6 - (1, 1, 32, 32) -
G_synthesis/noise7 - (1, 1, 32, 32) -
G_synthesis/noise8 - (1, 1, 64, 64) -
G_synthesis/noise9 - (1, 1, 64, 64) -
G_synthesis/noise10 - (1, 1, 128, 128) -
G_synthesis/noise11 - (1, 1, 128, 128) -
G_synthesis/noise12 - (1, 1, 256, 256) -
G_synthesis/noise13 - (1, 1, 256, 256) -
images_out - (?, 3, 256, 256) -
--- --- --- ---
Total 26086229
D Params OutputShape WeightShape
--- --- --- ---
images_in - (?, 3, 256, 256) -
labels_in - (?, 0) -
lod - () -
FromRGB_lod0 256 (?, 64, 256, 256) (1, 1, 3, 64)
256x256/Conv0 36928 (?, 64, 256, 256) (3, 3, 64, 64)
256x256/Conv1_down 73856 (?, 128, 128, 128) (3, 3, 64, 128)
Downscale2D - (?, 3, 128, 128) -
FromRGB_lod1 512 (?, 128, 128, 128) (1, 1, 3, 128)
Grow_lod0 - (?, 128, 128, 128) -
128x128/Conv0 147584 (?, 128, 128, 128) (3, 3, 128, 128)
128x128/Conv1_down 295168 (?, 256, 64, 64) (3, 3, 128, 256)
Downscale2D_1 - (?, 3, 64, 64) -
FromRGB_lod2 1024 (?, 256, 64, 64) (1, 1, 3, 256)
Grow_lod1 - (?, 256, 64, 64) -
64x64/Conv0 590080 (?, 256, 64, 64) (3, 3, 256, 256)
64x64/Conv1_down 1180160 (?, 512, 32, 32) (3, 3, 256, 512)
Downscale2D_2 - (?, 3, 32, 32) -
FromRGB_lod3 2048 (?, 512, 32, 32) (1, 1, 3, 512)
Grow_lod2 - (?, 512, 32, 32) -
32x32/Conv0 2359808 (?, 512, 32, 32) (3, 3, 512, 512)
32x32/Conv1_down 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
Downscale2D_3 - (?, 3, 16, 16) -
FromRGB_lod4 2048 (?, 512, 16, 16) (1, 1, 3, 512)
Grow_lod3 - (?, 512, 16, 16) -
16x16/Conv0 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
16x16/Conv1_down 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
Downscale2D_4 - (?, 3, 8, 8) -
FromRGB_lod5 2048 (?, 512, 8, 8) (1, 1, 3, 512)
Grow_lod4 - (?, 512, 8, 8) -
8x8/Conv0 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
8x8/Conv1_down 2359808 (?, 512, 4, 4) (3, 3, 512, 512)
Downscale2D_5 - (?, 3, 4, 4) -
FromRGB_lod6 2048 (?, 512, 4, 4) (1, 1, 3, 512)
Grow_lod5 - (?, 512, 4, 4) -
4x4/MinibatchStddev - (?, 513, 4, 4) -
4x4/Conv 2364416 (?, 512, 4, 4) (3, 3, 513, 512)
4x4/Dense0 4194816 (?, 512) (8192, 512)
4x4/Dense1 513 (?, 1) (512, 1)
scores_out - (?, 1) -
--- --- --- ---
Total 23052353
/stylegan/training/dataset.py
的有被寫死一些內容(如下 codeblock)。tfr_lods
應該是一個陣列,這個陣列內容會被設置成目前已存在的 dataset 的 tfr_lods
和預設要訓練的各個解析度的 dataset 是否相同,而預設的內容是要存放 4X4 到 1024X1024 的解析度的 dataset。必須把 99 行註解掉才能一次只跑一個 dataset。
...
# Determine shape and resolution.
max_shape = max(tfr_shapes, key=np.prod)
self.resolution = resolution if resolution is not None else max_shape[1]
self.resolution_log2 = int(np.log2(self.resolution))
self.shape = [max_shape[0], self.resolution, self.resolution]
tfr_lods = [self.resolution_log2 - int(np.log2(shape[1])) for shape in tfr_shapes]
assert all(shape[0] == max_shape[0] for shape in tfr_shapes)
assert all(shape[1] == shape[2] for shape in tfr_shapes)
assert all(shape[1] == self.resolution // (2**lod) for shape, lod in zip(tfr_shapes, tfr_lods))
assert all(lod in tfr_lods for lod in range(self.resolution_log2 - 1))
...
下載全部資料集前必須先調整磁碟大小,調整完後發現資料集被 google drive 因為太多人存取而限制下載。
今天在繼續嘗試昨天沒辦法下載的512X512和1024X1024資料集。512X512的資料集發現會在下載開始1小時後被強行中斷;1024X1024的資料集會因為google drive的存取限制(太多人存取)而不能下載。
目前發現原因是 403 forbidden,主因就是 google drive 的限制流量導致。但是不能確定實際狀況為何。
1111_courses
image_processing
:::
Oct 26, 20245-fold cross validation The original dataset is randomly divided into five subsets, each containing an equal number of samples. The model is trained and evaluated five times. In each iteration, one of the folds is held out as the validation set, while the remaining four folds are used for training. The model is trained on the four training folds and then tested on the held-out fold. This process is repeated five times, with each fold serving as the validation set once. The performance metrics, such as accuracy or error rate, are recorded for each iteration. The performance metrics obtained from the five iterations are averaged to provide an overall assessment of the model's performance. 將資料集拆分成五等分,每次訓練時以四等分作為 training dataset,剩下一份作為 validation dataset。訓練時紀錄每次的表現並在最後回報五次平均的表現以作為模型的整體評估。 AUC
Jun 14, 2023姓名:張議隆 學號:F74082125 :::spoiler TOC ::: 15.19 :::info Suppose we have the following requirements for a university database that is used to keep track of students’ transcripts:
Jun 13, 2023姓名:張議隆
Jun 6, 2023or
By clicking below, you agree to our terms of service.
New to HackMD? Sign up