# Determine the sample size required for ANN
*There is no "golden rule" for determining the minimum sample size in machine learning.*
However, there are some rules of thumb[^second], and *Alwosheel et al.*[^first] conducted extensive Monte Carlo analyses and concluded that a *"minimum sample size of **fifty times** the number of weights in the ANN"* is advised.
### Number of weights in Study 3
- Number of weights can be calculated with:
$$
N_w = (I+1) H_1 + (H_1+1) H_2 + \dots + (H_{n-1}+1) H_n + (H_n+1) O
$$
where $N_w$ is the number of weights, $I$ is the input dimension, $H_n$ is the dimension of hidden layer $n$, and $O$ is the output dimension. The $+1$ in each term accounts for the bias unit feeding the next layer.
According to the rule of thumb by *Heaton*[^third], it is recommended to start with one hidden layer and set $H_1$ to two-thirds of $I$.
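> Number of inputs ($I=6$):
> age, gender, years holding a driving license, car brand, years owning the car, frequency of using ADS (L2)
Hence, with $H_1 = 4$ and, as the total implies, a single output ($O=1$), the number of weights would be $(6+1) \cdot 4 + (4+1) \cdot 1 =$ **33**.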
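The weight count can be checked with a short script. This is a minimal sketch: the function name `count_weights` is hypothetical, and the single output unit ($O=1$) is an assumption inferred from the total of 33 rather than something stated explicitly.

```python
def count_weights(n_inputs, hidden_sizes, n_outputs):
    """Count weights (one bias per receiving unit included) in a
    fully connected feed-forward network."""
    layer_sizes = [n_inputs, *hidden_sizes, n_outputs]
    # Each pair of consecutive layers contributes (fan_in + 1) * fan_out weights.
    return sum(
        (fan_in + 1) * fan_out
        for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
    )


n_inputs = 6                          # the six inputs listed above
n_hidden = round(2 * n_inputs / 3)    # two-thirds rule of thumb -> 4
n_outputs = 1                         # assumption: single output unit

n_weights = count_weights(n_inputs, [n_hidden], n_outputs)
print(n_weights)  # 33
```

Passing a longer list such as `[4, 3]` for `hidden_sizes` extends the same count to networks with more hidden layers.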
### Required sample size
Applying the fifty-times rule to 33 weights, the sample size needed for ANN training is $50 \times 33 =$ **1650**. As described in the literature[^first], 70% of the sample would be used for training, while the remaining 30% would be used for validation and testing.
As a result, the full sample size needed for this study would be $1650 / 0.7 \approx 2357$, or around **2400** data points.
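The same arithmetic can be written as a small helper. This is only a sketch: the name `required_sample_size` and its parameters are hypothetical, with defaults taken from the fifty-times rule and the 70/30 split described above.

```python
import math


def required_sample_size(n_weights, factor=50, train_fraction=0.7):
    """Return (training samples, total samples) for a given weight count."""
    n_train = factor * n_weights
    # The training set is only a fraction of the full sample, so scale up
    # and round to the next whole data point.
    n_total = math.ceil(n_train / train_fraction)
    return n_train, n_total


n_train, n_total = required_sample_size(33)
print(n_train, n_total)  # 1650 2358
```

Rounding 2358 up to a convenient figure gives the target of roughly 2400 data points.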
### Tutorial for applying ANN with Python
- https://www.mltut.com/implementation-of-artificial-neural-network-in-python/
[^first]: Alwosheel, A., van Cranenburgh, S., & Chorus, C. G. (2018). Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis. *Journal of Choice Modelling*, 28, 167-182.
[^second]: Haykin, S. (2009). *Neural networks and learning machines* (3rd ed.). Pearson Education India.
[^third]: Heaton, J. (2008). *Introduction to neural networks with Java*. Heaton Research, Inc.