# **meeting 01/23**

**Advisor: Prof. Chih-Yu Wang \
Presenter: Shao-Heng Chen \
Date: Jan 23, 2023**

<!-- Chih-Yu Wang -->
<!-- Wei-Ho Chung -->

## **Nk-to-MSE**

<img src='https://hackmd.io/_uploads/H1A-Pgntp.png' width=50% weight=50%>

<!-- <img src='' width=50% weight=50%> <img src='' width=50% weight=50%> -->

## **Appendix**

### **Comparison**

### ```2-16-16```
<!-- L=35, fixed 0-6 -->

```shell
--------------------------------
random action:
mean: -19.33353920686245
std: 14.040572090396045
max: -0.19747483730316162
min: -98.2697525024414
shape: (688,)
--------------------------------
--------------------------------
optimal of (2, 16, 16) for use 0:
mean: -8.774910721868277
std: 5.475275859637399
max: -0.7906286716461182
min: -41.31150436401367
--------------------------------
--------------------------------
model inference of PPO-2023-11-11-2-16-16-mse:
mean: -5.930197238922119
std: 4.007392883300781
max: -0.4268815517425537
min: -30.30022621154785
shape: (1, 688)
--------------------------------
```

### ```4-16-16```
<!-- L=35 fixed 1-5 -->

```shell
--------------------------------
random action:
mean: -24.42123987317085
std: 12.33276585540742
max: -3.3646061420440674
min: -81.64408111572266
shape: (816,)
--------------------------------
--------------------------------
optimal of (4, 16, 16) for use 0:
mean: -13.100068550378085
std: 7.561458020475508
max: -0.9182575345039368
min: -54.341976165771484
--------------------------------
--------------------------------
model inference of PPO-2023-11-12-4-16-16-mse:
mean: -6.974239349365234
std: 3.2725419998168945
max: -1.1158757209777832
min: -22.820497512817383
shape: (1, 816)
--------------------------------
```

### ```6-16-16```
<!-- L=7 fixed 8 -->
<!-- L=8 fixed 8 -->

```shell
--------------------------------
random action:
mean: -30.355037929058074
std: 14.957692491545963
max: -5.377195358276367
min: -131.2487030029297
shape: (944,)
--------------------------------
--------------------------------
optimal of (6, 16, 16) for use 0:
mean: -14.236118802666665
std: 6.339965942189641
max: -3.1174938678741455
min: -45.45551300048828
--------------------------------
--------------------------------
model inference of PPO-2023-11-12-6-16-16-mse:
mean: -8.91183090209961
std: 3.6185500621795654
max: -1.410496711730957
min: -22.95004653930664
shape: (1, 944)
--------------------------------
```

### ```8-16-16```
<!-- L=8 fixed 0-6 -->

```shell
--------------------------------
random action:
mean: -39.48288138055801
std: 16.72815659717122
max: -8.901789665222168
min: -126.17889404296875
shape: (1072,)
--------------------------------
--------------------------------
optimal of (8, 16, 16) for use 0:
mean: -19.515297303915023
std: 7.351449763593945
max: -4.469184398651123
min: -53.17362976074219
--------------------------------
--------------------------------
model inference of PPO-2023-11-12-8-16-16-mse:
mean: -9.629619598388672
std: 3.9008870124816895
max: -2.013472080230713
min: -31.73265838623047
shape: (1, 1072)
--------------------------------
```

### ```10-16-16```
<!-- L=6 fixed 8-6 -->

```shell
--------------------------------
random action:
mean: -60.97131831550598
std: 25.754561808208482
max: -9.113597869873047
min: -245.71299743652344
shape: (1200,)
--------------------------------
--------------------------------
optimal of (10, 16, 16) for use 0:
mean: -40.62083244848252
std: 15.247324737096925
max: -7.568966865539551
min: -131.6959228515625
--------------------------------
--------------------------------
model inference of PPO-2023-11-12-10-16-16-mse:
mean: -12.799939155578613
std: 4.8955793380737305
max: -3.185335397720337
min: -34.9901237487793
shape: (1, 1200)
--------------------------------
```

### **Optimal baseline**

### ```2-16-16```
<!-- L=40, fixed 3 -->

```shell
--------------------------------
optimal of (2, 16, 16) for use 0:
mean: -8.774910721868277
std: 5.475275859637399
max: -0.7906286716461182
min: -41.31150436401367
--------------------------------
```

### ```4-16-16```
<!-- L=7 fixed 4 -->

```shell
--------------------------------
optimal of (4, 16, 16) for use 0:
mean: -13.100068550378085
std: 7.561458020475508
max: -0.9182575345039368
min: -54.341976165771484
--------------------------------
```

### ```6-16-16```
<!-- L=9, fixed 2 -->

```shell
--------------------------------
optimal of (6, 16, 16) for use 0:
mean: -14.236118802666665
std: 6.339965942189641
max: -3.1174938678741455
min: -45.45551300048828
--------------------------------
```

### ```8-16-16```
<!-- L=11, fixed 5 -->

```shell
--------------------------------
optimal of (8, 16, 16) for use 0:
mean: -19.515297303915023
std: 7.351449763593945
max: -4.469184398651123
min: -53.17362976074219
--------------------------------
```

### ```10-16-16```
<!-- L=8 fixed 4 -->

```shell
--------------------------------
optimal of (10, 16, 16) for use 0:
mean: -40.62083244848252
std: 15.247324737096925
max: -7.568966865539551
min: -131.6959228515625
--------------------------------
```
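
### **Summary helper (sketch)**

The blocks above all share one layout: a rule, a label, then mean / std / max / min and, for some entries, a shape. The script that produced them is not included in these notes, so the following is only a minimal sketch of how such a block could be generated. The function name `summarize`, the `show_shape` flag, and the use of the population standard deviation are assumptions, not the original implementation.

```python
import statistics

RULE = "-" * 32  # separator; width assumed to match the rules in the logs


def summarize(label, values, show_shape=True):
    """Return a summary block laid out like the log entries above.

    `values` is a flat sequence of per-sample scores. The population
    standard deviation is used here; that choice is an assumption.
    """
    values = [float(v) for v in values]
    lines = [
        RULE,
        f"{label}:",
        f"mean: {statistics.fmean(values)}",
        f"std: {statistics.pstdev(values)}",
        f"max: {max(values)}",
        f"min: {min(values)}",
    ]
    if show_shape:
        # Mimic the 1-D shape notation seen in the logs, e.g. (688,)
        lines.append(f"shape: ({len(values)},)")
    lines.append(RULE)
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy input for illustration only; these are not measured values.
    print(summarize("random action", [-1.0, -2.0, -3.0]))
```

Calling `summarize(label, values, show_shape=False)` gives the shorter form used by the "optimal of ..." entries, which omit the shape line.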