Training / Validation / Test Data Set

# Training / Validation / Test Data Set

###### tags: `archived`

## Definition of Training / Validation / Test Data Set

* Training set: a set of examples used for learning: to fit the parameters of the classifier.
  In the MLP case, we would use the training set to find the "optimal" weights with the back-prop rule.
* Validation set: a set of examples used to tune the parameters of a classifier.
  In the MLP case, we would use the validation set to find the "optimal" number of hidden units or determine a stopping point for the back-propagation algorithm.
* Test set: a set of examples used only to assess the performance of a fully-trained classifier.
  In the MLP case, we would use the test set to estimate the error rate after we have chosen the final model (MLP size and actual weights).

After assessing the final model on the test set, YOU MUST NOT tune the model any further!

> The definitions above are taken from mohsen najafzadeh's answer to [What is the difference between test set and validation set?](https://stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set)

In short, the **training data set** is used to train the network's parameters (i.e. weights); the **validation data set** is used to check how well the trained network's hyperparameters work, which helps us choose suitable hyperparameters (i.e. model selection); the **test data set** is used at the very end to measure how good the trained network is.

## Usage of Training / Validation / Test Data Set

Before training a network, we should have two groups of data:

* one for training and model selection (call it A)
* one for testing (call it B)

Usually we split A into a Training Data Set and a Validation Data Set, and use B as the Test Data Set.

We then train the network on the Training Data Set. During training, every few iterations, we usually evaluate the network on the Validation Data Set. If the results look poor, we can immediately abandon the current network, or change its hyperparameters and train again.

Suppose that after training we have N trained networks. We can then compare their validation accuracy to do model selection. Finally, we run the selected network on the Test Data Set. The resulting test accuracy serves as a reference for the accuracy we can expect on unseen data.

## Validation Data Set

The procedure above is only one common approach. According to [[Article] Nuts and Bolts of Applying Deep Learning](https://kevinzakka.github.io/2016/09/26/applying-deep-learning/), the Validation Data Set does not have to be split off from A. We can also do either of the following:

* Use A as the Training Data Set, and split B into a Validation Data Set and a Test Data Set.
* Use part of A as the Training Data Set and the rest of A as Validation Data; likewise, use part of B as the Test Data Set and the rest of B as Validation Data.

> Note: in that article, the Validation Data Set is called the Development Data Set.

## Confusion about Train/Test Phase in Caffe

Here, Train/Test Phase refers to what Caffe is doing at a given stage.

`Train Phase`: the network is being trained, i.e. the input data affects how the network weights are adjusted.

`Test Phase`: the network is being tested; only forward propagation is executed.

So how do the Training / Validation / Test Data Sets relate to these two phases?

* `Train Phase` updates the network weights, so its input is the Training Data Set.
* `Test Phase` does not update the weights. However, when Caffe trains a network, it alternates between Train Phase and Test Phase. If we use these intermediate results to check how well training is going and **change hyperparameters**, or use the final Test Phase accuracy after training to **do model selection**, then the input of the Test Phase is the Validation Data Set.

Normally, therefore, the Test Data Set is used only after the network has been trained and deployed, for example by running the following command:

```shell
caffe test -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -gpu 0 -iterations 100
```

Of course, if we neither tune hyperparameters nor do model selection, then the input data of the `Test Phase` is the Test Data Set, even though the test accuracy is displayed while training is still running.

## Reference

1. [[Article] Nuts and Bolts of Applying Deep Learning](https://kevinzakka.github.io/2016/09/26/applying-deep-learning/)
2. [[Video] Nuts and Bolts of Applying Deep Learning](https://www.youtube.com/watch?v=F1ka6a13S9I)
3. [What is the difference between test set and validation set?](https://stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set)
4. [Tutorial for Caffe](http://tutorial.caffe.berkeleyvision.org/tutorial/interfaces.html)
5. [Excluding Layers: Train and Test Phase](https://github.com/BVLC/caffe/wiki/Excluding-Layers:-Train-and-Test-Phase)
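
## Example sketch: data split and model selection

The workflow described in the usage section (split the data, pick a model by validation accuracy, then touch the test set only once) can be sketched in plain Python. This is only an illustration under stated assumptions, not how Caffe does it: the helper names `split_dataset`, `select_model` and `accuracy`, the 60/20/20 split, and the toy threshold classifiers that stand in for "N trained networks" are all hypothetical and do not come from the references.

```python
import random


def split_dataset(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut into disjoint training / validation / test lists."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1.0:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def accuracy(model, dataset):
    """Fraction of (x, label) pairs that the model classifies correctly."""
    correct = sum(1 for x, label in dataset if model(x) == label)
    return correct / len(dataset)


def select_model(candidates, evaluate, val_set):
    """Model selection: return the candidate with the best validation score."""
    return max(candidates, key=lambda model: evaluate(model, val_set))


if __name__ == "__main__":
    # Toy data: the label is 1 when x >= 50, otherwise 0.
    data = [(x, int(x >= 50)) for x in range(100)]
    train, val, test = split_dataset(data)

    # Hypothetical stand-ins for N already-trained networks. In a real
    # workflow, each candidate would first be fitted on `train`.
    candidates = [lambda x, t=t: int(x >= t) for t in (30, 50, 70)]

    best = select_model(candidates, accuracy, val)
    print("validation accuracy:", accuracy(best, val))
    # The test set is used exactly once, after the model has been chosen.
    print("test accuracy:", accuracy(best, test))
```

The three lists are disjoint by construction, so the test accuracy is not influenced by the choices made on the validation set.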