Practical Deep Learning - Day 2
Schedule
| Time        | Contents                      | Instructor(s) |
|-------------|-------------------------------|---------------|
| 09:00-09:10 | Welcome and recap             | YW            |
| 09:10-10:00 | Monitor the training process  | YW            |
| 10:00-10:10 | Coffee Break                  |               |
| 10:10-11:00 | Advanced Layer Types          | AM            |
| 11:00-11:10 | Coffee Break                  |               |
| 11:10-11:50 | Transfer learning & Outlook   | AM            |
| 11:50-12:00 | Wrap-up                       |               |
Set up your environment
LUMI
Go to the Open OnDemand interface: https://www.lumi.csc.fi/pun/sys/dashboard/
Choose Jupyter
WARNING: The container had to be upgraded, so some parameters have changed. Use the values below, and leave blank any field that is shown without a value.
- Project: project_465001310
- Partition: small-g

Resources
- Number of CPU cores: 8
- Memory (GiB): 8
- Number of GPUs: 1
- Time: 4:00:00

Settings
- Working directory: /scratch/project_465001310
- Show advanced settings:
- Custom Python type: container
- Modules to load:
- Path to container with Python:
- Container arguments:
- Init script for container:
- Enable virtual environment:
- Save settings: Give it a name, like `deep-learning-intro`, so that you can use it later!
Launch Jupyter and change to your working directory. The init script should have created one for you under the scratch directory, at `env-deep-learning-intro/workspace/$USER/deep-learning-intro/notebooks/lumi`.
Get the new notebooks!
Either run `git pull` from the terminal or use the JupyterLab git interface.
- Navigate to the directory containing the notebooks
- Click on the left sidebar and then on the cloud button with a down arrow, as shown below
*(screenshot of the JupyterLab git sidebar and its cloud-with-down-arrow pull button not available)*
How to use TensorBoard on LUMI
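Since the screenshot is missing, here is a minimal sketch of wiring TensorBoard into a Keras training run; the tiny model and random data are placeholders, not the workshop's notebook:

```python
import numpy as np
from tensorflow import keras

# Placeholder model and data; substitute the workshop's model and dataset.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(100, 4)
y = np.random.rand(100)

# The TensorBoard callback writes loss curves and other metrics to log_dir.
tb = keras.callbacks.TensorBoard(log_dir="logs/run1")
model.fit(X, y, epochs=5, validation_split=0.2, callbacks=[tb])

# Then start TensorBoard and point it at the logs, e.g. from a terminal:
#   tensorboard --logdir logs
# or inside JupyterLab:
#   %load_ext tensorboard
#   %tensorboard --logdir logs
```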
You can ask questions about the workshop content at the bottom of this page. We use the Zoom chat only for reporting Zoom problems and such.
- Is this how to ask a question?
    - Yes, and an answer will appear like so!
3. Monitor the training process
- I wonder if it is common to combine two optimizers, for example, use Adam until the loss function reaches a given threshold, and then switch to something better once one is closer to the minimum.
    - Optimizers are initialized when training starts, so it is not easy to switch in between. However, what one can do is use a learning-rate scheduler instead of the fixed learning rate that is the default. Something like a cosine schedule is commonly used: start with a high learning rate and then reduce it progressively (see the sketch below).
        - Thank you.
    - Also a side note: the Adam optimizer is actually a combination of two older optimizers, RMSprop and momentum.
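A minimal sketch of such a cosine schedule in Keras (the numbers are illustrative, not values from the lesson):

```python
from tensorflow import keras

# Decay the learning rate along a cosine curve: high at the start,
# progressively lower as training proceeds.
schedule = keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,   # illustrative starting value
    decay_steps=10_000,           # total number of training steps to decay over
)
optimizer = keras.optimizers.Adam(learning_rate=schedule)
# model.compile(optimizer=optimizer, loss=..., metrics=...)
```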

- Why have we chosen 200 epochs to begin with?
    - This is arbitrary, and chosen to demonstrate overfitting; but as you saw at the end of the lesson, using `EarlyStopping` manages to end the training well before 200 epochs. This is the way to go (see the sketch below).
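For reference, a minimal `EarlyStopping` sketch (the patience value is illustrative):

```python
from tensorflow import keras

# Stop training once the validation loss has not improved for 10 epochs,
# and restore the best weights seen so far.
early_stopper = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=10,                  # illustrative; tune for your problem
    restore_best_weights=True,
)
# model.fit(X_train, y_train, epochs=200,
#           validation_data=(X_val, y_val),
#           callbacks=[early_stopper])
```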
- My question is: Do you always aim for the global minimum? Is the algorithm able to test the whole landscape, or only a certain area around the current location? If we don't reach the global minimum, but only a local one, is this then not correct?
    - The aim of training is to find the global minimum, but in reality it is difficult to reach if the constructed model is complex.
    - So our aim is to get as close to the global minimum as possible.
    - Would you then perform the search x times and average the result?
        - That is not a good option. A better way is to explore the parameter space and accept a "minimum", even if this minimum is not the global one.
        - More explanation: in theory we want the global minimum, but in practice we rarely reach it.
        - Instead, we settle for a local minimum, or some "good" point that gives good performance in the prediction.
    - How do you explore this in an automatic way? That is not so clear to me, and when do you stop your search?
        - We start at some point in the parameter space and use an optimization algorithm (GD, SGD, mini-batch GD, etc.) to follow the gradient/slope of the loss function locally, descending to a nearby local minimum (see the sketch after this thread).
        - Then we test the model; if the set of parameters is not good, we fine-tune the model until we get good performance on the test dataset.
        - There are metrics (like the F1 score) to validate the evaluation.
        - We can train many sets of model parameters, and we pick the set that gives the lowest loss, as that set is the most promising one.
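To illustrate the "nearby local minimum" point, here is a toy sketch (not from the lesson) of plain gradient descent on a 1-D loss with two minima; the starting point decides which minimum you end up in:

```python
import numpy as np

# Toy 1-D loss with two minima (global near w = -2.35, local near w = 2.1).
def loss(w):
    return 0.1 * w**4 - w**2 + 0.5 * w

def grad(w):
    return 0.4 * w**3 - 2 * w + 0.5

for w0 in (-3.0, 3.0):            # two different starting points
    w = w0
    for _ in range(200):          # plain gradient descent, fixed step size
        w -= 0.01 * grad(w)
    print(f"start {w0:+.1f} -> w = {w:+.3f}, loss(w) = {loss(w):.3f}")
```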
- Could you explain the term "baseline" in "baseline prediction"?
    - The baseline is the reference point against which we evaluate the trained model.
    - If the trained model outperforms the baseline (with a smaller value of the loss function), we can say that the trained model is acceptable.
    - If the trained model cannot pass the line set by the baseline (that is why it is called a baseline), the constructed model is not a good model.
    - Then we have two options: abandon the constructed model, or try to improve it with one of many strategies.
    - We provided three exercises at Step 9; you can explore each strategy to see how to improve the model. A minimal baseline sketch follows below.
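As a minimal illustration (with made-up data, not the lesson's dataset), a common baseline is simply predicting the mean of the training targets, e.g. with scikit-learn's `DummyRegressor`:

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_squared_error

# Made-up regression data; replace with your own train/test split.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 5)), rng.normal(size=100)
X_test, y_test = rng.normal(size=(20, 5)), rng.normal(size=20)

# Baseline: always predict the training-set mean.
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
baseline_mse = mean_squared_error(y_test, baseline.predict(X_test))
print(f"baseline MSE: {baseline_mse:.3f}")   # a trained model should beat this
```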
4. Advanced layer types
Number of features in Dollar Street 10
How many features does one image in the Dollar Street 10 dataset have?
- A. 64
- B. 4096
- C. 12288 +++++
- D. 878

(C is correct: the images are 64 × 64 pixels with 3 color channels, and 64 × 64 × 3 = 12288.)
- Building a CNN looks like it uses stencil computations, is it?
    - Yes, you are absolutely right.
    - Building and executing a CNN does involve stencil-like computations, particularly in the convolutional layers (see the sketch below).
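A toy sketch of that stencil pattern: each output value is a weighted sum of a 3×3 neighbourhood of the input (strictly speaking, what Keras's `Conv2D` computes is this cross-correlation):

```python
import numpy as np

image = np.random.rand(8, 8)                 # toy single-channel "image"
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]])              # Sobel-like edge-detection stencil

# "Valid" convolution: slide the 3x3 stencil over the image; each output
# pixel is the weighted sum of its neighbourhood, so the output shrinks.
out = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
print(out.shape)                             # (6, 6)
```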
- I've seen Google use images for their CAPTCHAs. Some of them have dark edges or are too saturated. Why black edges, though? To confuse the model?
    - Dark edges (or high contrast) in a CAPTCHA can be a deliberate design choice to confuse automated models, especially CNNs, which are good at pattern recognition.
    - Yes, black edges (or dark borders) are used to confuse the model; the reason is that CNNs are sensitive to edges and contrast.
- Could you explain in more detail the relation between accuracy and loss? Is it something similar to bias vs. variance?
Network depth: Try it on your own!
https://enccs.github.io/deep-learning-intro/4-advanced-layer-types/#refine-the-model
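If you want a starting point, here is a minimal sketch of the kind of CNN the exercise asks you to deepen; the extra `Conv2D` layer is marked, and the 64×64×3 input and 10 classes are assumptions based on Dollar Street 10, not the lesson's exact architecture:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    keras.layers.Conv2D(32, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),   # the added layer
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
# Use sparse_categorical_crossentropy instead if your labels are integers.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```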
5. Transfer learning & Outlook
Keras Applications
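For reference, a minimal transfer-learning sketch with `keras.applications`; DenseNet121 is just one choice of pretrained network, and the input shape and class count are again assumptions:

```python
from tensorflow import keras

# Reuse a network pretrained on ImageNet and train only a new head.
base = keras.applications.DenseNet121(
    include_top=False,            # drop ImageNet's 1000-class head
    weights="imagenet",
    input_shape=(64, 64, 3),
)
base.trainable = False            # freeze the pretrained features

inputs = keras.Input(shape=(64, 64, 3))
x = base(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```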
- Which metric should we choose to validate the trained model?
    - One option is balanced accuracy (see the sketch below).
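A tiny sketch of why balanced accuracy helps on imbalanced classes (using scikit-learn; the labels are made up):

```python
from sklearn.metrics import balanced_accuracy_score

# Balanced accuracy averages per-class recall, so a majority-class guesser
# no longer looks good on an imbalanced dataset.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]          # always predicts the majority class
print(balanced_accuracy_score(y_true, y_pred))   # 0.5, while plain accuracy is 0.8
```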
Always ask questions at the very bottom of this document, right above this.