# GA3: Iterative improvement GRU

## Baseline

loss: 33.6489 - mae: 33.6489 - mse: 3142.6521 - val_loss: 34.4024 - val_mae: 34.4024 - val_mse: 3118.5088

## Learning rate /10

loss: 35.8021 - mae: 35.8021 - mse: 3852.7041 - val_loss: 38.0383 - val_mae: 38.0383 - val_mse: 4034.3892

## Epochs to 40

loss: 30.4760 - mae: 30.4760 - mse: 2664.6270 - val_loss: 31.7697 - val_mae: 31.7697 - val_mse: 2747.3564

## Make the model more complex, since it is not that powerful while the learning curves are clean

## Hidden RNN units to 16

loss: 30.2478 - mae: 30.2478 - mse: 2621.0774 - val_loss: 31.6304 - val_mae: 31.6304 - val_mse: 2710.1279

## Add Dense(16) layer

loss: 30.1622 - mae: 30.1622 - mse: 2611.0139 - val_loss: 31.5466 - val_mae: 31.5466 - val_mse: 2672.4226

## Increase window size to 24 (one day)

loss: 30.0298 - mae: 30.0298 - mse: 2583.3704 - val_loss: 31.7806 - val_mae: 31.7806 - val_mse: 2685.6548

## Window size back to 4, change activation function to 'swish'

loss: 30.3862 - mae: 30.3862 - mse: 2635.8213 - val_loss: 32.7821 - val_mae: 32.7821 - val_mse: 2852.8528

## Activation on the dense layer back to relu

loss: 30.1691 - mae: 30.1691 - mse: 2625.6216 - val_loss: 31.5403 - val_mae: 31.5403 - val_mse: 2700.7539

## Add Dense(16) in front (feature expansion)

loss: 31.0327 - mae: 31.0327 - mse: 2717.8989 - val_loss: 32.1549 - val_mae: 32.1549 - val_mse: 2676.1357

## Change to feature reduction (8) with increased window size (24)

loss: 31.5027 - mae: 31.5027 - mse: 2780.7532 - val_loss: 32.5979 - val_mae: 32.5979 - val_mse: 2816.5869

## RNN units to 8

loss: 30.6311 - mae: 30.6311 - mse: 2683.5879 - val_loss: 31.9729 - val_mae: 31.9729 - val_mse: 2842.

## Remove first dense layer, RNN back to 16

## Activation relu

## Activation tanh

Very slow learner.

## Increase learning rate

tanh is still far too slow 🤣

## Window size 24, play with RNN size

- RNN = 4 -> not bad, but not good (± 32)
- RNN = 16 -> 31.8
- RNN = 32 -> 31.72
- RNN = 64 -> 31.47 😍 winner winner chicken dinner
- RNN = 128 -> 32

## Try out swish again

- swish: 31.65
- tanh: 32.81...
- tanh only after the GRU layer -> ± 31.80

## Re-executed RNN = 64 and got 32.5 as the last value, so maybe we are not evaluating runs in a reliable way

## Remove last dense layer

31.45

## Increase batch size to 256

Actually looks decent.

## L2 regularization

The L2 factor can literally go up to 1000 without ruining the results. 🥶🥵🥶 Regularization does essentially nothing here: no big changes from 0.01 to 1000.
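
## Sketch of the final configuration

A minimal sketch (not the exact notebook code) of the setup that performed best in this log: window size 24, a single GRU with 64 units, no extra hidden Dense layer, MAE loss, batch size 256. The feature count `N_FEATURES`, the learning rate, the L2 factor, and the `x_train`/`y_train` variable names are assumptions for illustration, not values from the original experiments.

```python
# Hedged sketch of the final GA3 configuration; adapt shapes and hyperparameters
# to the actual dataset.
from tensorflow import keras
from tensorflow.keras import layers

WINDOW_SIZE = 24      # one day of samples, the best window size found in the log
N_FEATURES = 1        # assumption: univariate series; set to the real feature count
BATCH_SIZE = 256      # larger batch size "actually looks decent"
L2_FACTOR = 0.01      # the log suggests this barely matters anywhere from 0.01 to 1000

# Single GRU(64) followed directly by the regression output;
# the extra hidden Dense layers were removed in the final iterations.
inputs = keras.Input(shape=(WINDOW_SIZE, N_FEATURES))
x = layers.GRU(64, kernel_regularizer=keras.regularizers.l2(L2_FACTOR))(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # assumed learning rate
    loss="mae",
    metrics=["mae", "mse"],
)

# Hypothetical training call; x_train is expected to have shape
# (n_samples, WINDOW_SIZE, N_FEATURES) and y_train shape (n_samples,).
# history = model.fit(x_train, y_train,
#                     validation_data=(x_val, y_val),
#                     epochs=40, batch_size=BATCH_SIZE)
```

Given the run-to-run variance noted above (the re-run of RNN = 64 landing at 32.5), comparing configurations on a single final epoch value is noisy; averaging the last few validation values over a couple of runs would make the comparisons more trustworthy.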