
<p style="text-align: center"><b><font size=5 color=blueyellow>Julia for High-Performance Data Analysis - Day 2</font></b></p>
:::success
**Julia for High-Performance Data Analysis — Schedule**: https://hackmd.io/@yonglei/julia-hpda-2025-schedule
:::
## Schedule
| Time | Contents | Instructor |
| :---------: | :------: | :--------: |
| 09:00-09:05 | Welcome | YW |
| 09:05-09:20 | Motivation | AA |
| 09:20-10:10 | Data Formats and Dataframes | FF |
| 10:10-10:20 | Break | |
| 10:20-11:05 | Linear algebra | FF |
| 11:05-11:15 | Break | |
| 11:15-11:55 | Linear regression | DE |
| 11:55-12:00 | Q/A | |
## Lesson materials and recorded videos
:::info
- **Introduction to programming in Julia**
- lesson material: https://enccs.github.io/julia-intro/
- recorded video: https://www.youtube.com/watch?v=EYNlE-zma7A&list=PL2GgjY1xUzfDlGVcvl757nEOxICgcGSWM&index=1
- **Julia for high-performance scientific computing**
- lesson material: https://enccs.github.io/julia-for-hpc/
- recorded videos: https://www.youtube.com/watch?v=laCl9cXGOk4&list=PL2GgjY1xUzfDlGVcvl757nEOxICgcGSWM&index=2
- **Julia for high-performance data analytics**
- lesson material: https://enccs.github.io/julia-for-hpda/
:::
---
:::danger
You can ask questions about the workshop content at the bottom of this page. We use the Zoom chat only for reporting Zoom problems and such.
:::
## Questions, answers and information
- Is this how to ask a question?
- Yes, and an answer will appear like so!
### 1. [Motivation](https://enccs.github.io/julia-for-hpda/motivation/)
### 2. [Data Formats and Dataframes](https://enccs.github.io/julia-for-hpda/dataformats-dataframes/)
- names(df) issues
- David: `names(df)` returns the column names of the dataframe, and it works in Julia 1.11.3 with DataFrames v1.7.0. As Yevgen pointed out in the Zoom chat, it did not work here because the session already contained a variable called `names`, which shadowed the function.
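The shadowing problem described above can be reproduced with a small sketch. The dataframe `df` and the contents of the local `names` variable are made up for illustration, and a `let` block is used here only to keep the shadowing local; in the workshop session the variable was presumably defined at the top level instead.

```julia
using DataFrames

# Hypothetical example data, not the dataset used in the lesson.
df = DataFrame(a = 1:3, b = ["x", "y", "z"])

# With no variable called `names` in scope, this returns the column names.
cols = names(df)

# Inside this block, `names` is a local Vector, so `names(df)` tries to
# call a Vector and throws a MethodError, which we catch and return.
shadowed_result = let names = ["not", "a", "function"]
    try
        names(df)
    catch err
        err
    end
end

# Qualifying the call with the module still reaches the function.
recovered = let names = ["not", "a", "function"]
    Base.names(df)
end
```

If this happens in an interactive session, calling `Base.names(df)` or restarting the session with a different variable name should get things working again.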
:::info
#### Break until 10:20
:::
### 3. [Linear algebra](https://enccs.github.io/julia-for-hpda/linear-algebra/)
- This example builds PCA from the ground up to illustrate the linear algebra components. There are also packages for this (you can try them later: https://juliastats.org/MultivariateStats.jl/stable/pca/).
- For an application of PCA to mechanics, see https://en.wikipedia.org/wiki/Proper_orthogonal_decomposition
- See https://en.wikipedia.org/wiki/Diagonalizable_matrix about diagonalizable matrices. See also the spectral theorem in case you want to know more: https://en.wikipedia.org/wiki/Spectral_theorem.
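As a rough illustration of how diagonalization enters PCA, here is a minimal sketch that uses only the standard library. It is not the code from the lesson: the synthetic data and all variable names are assumptions made for this example. The covariance matrix is symmetric, so by the spectral theorem it has real eigenvalues and an orthonormal set of eigenvectors, and these eigenvectors are the principal directions.

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(1)
n = 200

# Synthetic data (an assumption for this sketch): two strongly correlated features.
t = randn(n)
X = hcat(t .+ 0.1 .* randn(n), 2 .* t .+ 0.1 .* randn(n))

# Center each column, then form the covariance matrix of the features.
Xc = X .- mean(X, dims = 1)
C = Symmetric(cov(Xc))

# Eigendecomposition of the symmetric matrix C.
E = eigen(C)

# Sort eigenpairs by decreasing eigenvalue (variance along each direction).
order = sortperm(E.values, rev = true)
λ = E.values[order]
V = E.vectors[:, order]

# Project the centered data onto the principal directions.
scores = Xc * V

# Fraction of the total variance explained by each component.
explained = λ ./ sum(λ)
```

In this sketch the covariance of `scores` is diagonal with the eigenvalues on the diagonal, which is the diagonalization `C = V * Diagonal(λ) * V'` seen from the data side.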
:::info
#### Break until 11:15
:::
### 4. [Linear regression](https://enccs.github.io/julia-for-hpda/regression/#linear-regression-with-synthetic-data)
- If we don't know the functional form, how can we do the fitting and prediction? How can we evaluate the fitted parameters?
- Good question. The root mean square error on a test set (data not used for fitting the model) can be a good way to evaluate the model. Also see the statistics box you get when doing `fit` or `lm` with the GLM package.
- Another question is how to choose the basis functions and, if polynomials are chosen, what degree to use. For periodic signals one may want to use trigonometric functions. For polynomials, as small a degree as possible is good.
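The idea of comparing polynomial degrees by their test-set error can be sketched as follows. This is only an illustration under stated assumptions: the ground-truth function `f`, the noise level, the 90/30 split, and the helper names `design`, `rmse` and `fit_and_score` are all invented for this example, and it uses plain least squares with the backslash operator rather than the GLM package mentioned above.

```julia
using LinearAlgebra, Random, Statistics

Random.seed!(42)

# Hypothetical ground truth (quadratic) plus Gaussian noise.
f(x) = 1.0 + 2.0x - 0.5x^2
x = range(-3, 3; length = 120)
y = f.(x) .+ 0.3 .* randn(length(x))

# Random split: 90 points for fitting, 30 held out for testing.
idx = shuffle(1:length(x))
train, test = idx[1:90], idx[91:end]

# Design matrix with polynomial basis functions 1, x, x^2, ..., x^d.
design(x, d) = [xi^k for xi in x, k in 0:d]

# Root mean square error between predictions and observations.
rmse(ŷ, y) = sqrt(mean(abs2, ŷ .- y))

# Fit by least squares on the training set, score on the test set.
function fit_and_score(d)
    β = design(x[train], d) \ y[train]
    return rmse(design(x[test], d) * β, y[test])
end

# Test RMSE for polynomial degrees 1 to 6.
errors = Dict(d => fit_and_score(d) for d in 1:6)
```

One would typically pick the smallest degree after which the test error stops improving noticeably. For a periodic signal, the columns of `design` could be replaced by sines and cosines instead of powers of `x`.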
:::warning
**Reflections and quick feedback:**
One thing that you liked or found useful for your projects?
- Learning how to read datasets properly and how to do fitting was helpful for me as a beginner
One thing that was confusing/suboptimal, or something we should do to improve the learning experience?
- I feel that the recording prevents me from asking questions right away in Zoom. Still, I see that making the lessons available to others via the recording is more important.
- You can use the Zoom room to ask questions. We remove personal information and the Q/A parts before we publish the videos, so there is no need to worry about personal information leaking.
- To keep the teaching running smoothly, we recommend using the Zoom chat box or HackMD to share questions.
:::
:::info
*Always ask questions at the very bottom of this document, right **above** this.*
:::