[toc]
# Homework
## HW1: Dimension Analysis
*The magnitudes of $\pi$ groups are of $\mathcal{O}(1)$.*
::: info
### Example: Drag force
For
$$
F=F(\mu,v,r,\rho) ,
$$
we can formulate 2 $\pi$ groups, namely
* Reynolds number $\pi_1=Re:=\rho vr/\mu$
* $\pi_2=F/\rho v^2r^2\sim \begin{cases}1/Re & Re<Re_0 \\ constant & Re_c>Re\geq Re_0 \\ turbulence & Re\geq Re_c \end{cases}$ .
For the first case, we see $F\sim\mu vr$; for the second, $F\sim \rho v^2r^2$.
Furthermore, $P\sim Fv$.
Another way to view the question is by its scale. Macroscopically, viscosity is negligible, and $F\approx F(\rho,v,r)$; microscopically, *inertia is negligible*, $F\approx F(\mu,v,r)$.
For a spherical object, we have exactly $F=6\pi\mu vr$ (Stokes' drag law).
Yet another way to view Reynolds number is as the ratio of time scale between two processes:
$$
Re := \frac{\rho vr}{\mu} = \frac{r^2/(\mu/\rho)}{r/v} = \frac{t_{\mathrm{diff}}}{t_{\mathrm{adv}}}
$$
:::
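The claim that these $\pi$ groups are dimensionless can be checked with a small bookkeeping script (the `dims` table and `product_dims` helper are ad-hoc names, not from the lecture); each quantity is stored as its exponents in the base dimensions $(M,L,T)$:

```python
# Illustrative dimensional bookkeeping (names are ad hoc, not from the lecture).
# Each quantity maps to its exponent vector in the base dimensions (M, L, T).
dims = {
    "rho": (1, -3, 0),   # density: M L^-3
    "v":   (0, 1, -1),   # velocity: L T^-1
    "r":   (0, 1, 0),    # length: L
    "mu":  (1, -1, -1),  # dynamic viscosity: M L^-1 T^-1
    "F":   (1, 1, -2),   # force: M L T^-2
}

def product_dims(powers):
    """Dimension exponents of the product q1**p1 * q2**p2 * ..."""
    return tuple(sum(p * dims[q][i] for q, p in powers.items()) for i in range(3))

# pi_1 = Re = rho*v*r/mu and pi_2 = F/(rho*v^2*r^2) are both dimensionless:
print(product_dims({"rho": 1, "v": 1, "r": 1, "mu": -1}))   # (0, 0, 0)
print(product_dims({"F": 1, "rho": -1, "v": -2, "r": -2}))  # (0, 0, 0)
```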
::: info
### Wave speed
We have
$$
c = c(g,\lambda,H) ,
$$
and can formulate $\pi$ groups
* $\pi_1=c/\sqrt{g\lambda}$
* $\pi_2=H/\lambda$
$\pi_2$ can be considered the "deepness" of a wave.
Physically speaking, when $\lambda/H\to\infty$ ("shallow"), we expect $c$ to be independent of $\lambda$; for $H/\lambda\to\infty$ ("deep"), $c$ should be independent of $H$.
:::
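Both limits can be seen numerically at once. As an illustration (the full linear dispersion relation $c^2=(g/k)\tanh(kH)$ with $k=2\pi/\lambda$ is not derived in these notes, but it is the standard result consistent with the $\pi$ groups above):

```python
import math

def wave_speed(g, lam, H):
    """Phase speed from the linear dispersion relation c^2 = (g/k) tanh(kH)."""
    k = 2 * math.pi / lam
    return math.sqrt(g / k * math.tanh(k * H))

g, H = 9.81, 10.0
# "Shallow" (lam >> H): c -> sqrt(g*H), independent of lam.
print(wave_speed(g, 1e4, H), math.sqrt(g * H))
# "Deep" (H >> lam): c -> sqrt(g*lam/(2*pi)), independent of H.
lam = 10.0
print(wave_speed(g, lam, 1e4), math.sqrt(g * lam / (2 * math.pi)))
```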
# Neural networks
## Regression and Classification
### Regression
* Linear regression:
\begin{align}
\mathbf{y} &= \mathbf{w}^T\mathbf{x} + b\mathbf{1} + \mathbf{e} , \\
F(x_i) &= w_ix_i + b
\end{align}
where $\mathbf{x}$ is the input, $\mathbf{y}$ the output, $b$ the intercept, $\mathbf{w}$ the weights, and $\mathbf{e}$ the error.
We wish to optimise $\mathbf{w}$ such that a loss function is minimised, e.g. $L:=||\mathbf{e}||^2$.
* Logistic regression:
The linearly regressed $F(x)$ is put through an activation function $\sigma$:
\begin{align}
F(x_i) &= \sigma(w_ix_i+b).
\end{align}
This process introduces nonlinearity, enabling more complex regression, but also making the process irreversible.
Common activation functions include
* Sigmoid: $\sigma(x)=1/(1+\mathrm{e}^{-x})$
* Hyperbolic tangent: $\sigma(x)=\tanh(x)$
* ReLU: $\sigma(x)=\max(0,x)$
* LeakyReLU: $\sigma(x)=\begin{cases}x & x>0 \\ ax & x\leq0\end{cases}$, where $0<a<1$
The performance of a regression can be determined by a designed loss function, e.g. $L^n$-norms.
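A minimal NumPy sketch of both regressions (the data and variable names are illustrative only): linear regression minimises $L=\|\mathbf{e}\|^2$ in closed form, and logistic regression passes the linear output through a sigmoid:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Linear regression: minimise L = ||e||^2 = ||y - (Xw + b)||^2 ---
X = rng.normal(size=(100, 2))
w_true, b_true = np.array([2.0, -1.0]), 0.5
y = X @ w_true + b_true + 0.01 * rng.normal(size=100)

# Append a column of ones so the intercept b is fitted together with w.
Xb = np.hstack([X, np.ones((100, 1))])
coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
w_fit, b_fit = coef[:2], coef[2]

# --- Logistic regression forward pass: F(x) = sigma(w.x + b) ---
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(X @ w_fit + b_fit)  # outputs squashed into (0, 1)
```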
### Classification
Separate the data into two ($n$) groups; the goal is to maximise the line's distance to both groups (the line separates the two groups).
## Neural Networks
### Basic premises of neural networks
* Kernel trick: transform (warp) the coordinates to separate the two groups of points.
    * E.g. 2D → 3D: a plane can be found that separates the desired points.
    * Stacking more and more layers means warping more times, hence more nonlinearity.
* $\sigma$ as activation function: each layer computes $w\sigma(\cdot)+b$.
*(to be filled in)*
* Neural classification: find the warped boundary that fits best (not a one-to-one linear functional relationship).
* Why neural networks can learn:
    * Loss function: weights are adjusted according to performance.
    * Activation function: with the wrong activation function, only garbage is kneaded out (the two classes of data cannot be separated).
* Machine vs deep learning:
    * Machine learning: manual feature extraction, then classification.
    * Deep learning: replacing manual feature extraction with deeper neural networks.
* How to overcome the limit on the number of neurons:
    * When the input data are images, there may be millions of variables, costing too much computing resource and time; convolution reduces the number of variables.
    * GPU: IO stays on the card, so simple operations are fast, but it is limited by on-board memory (max: 80 GB).
    * CPU: IO is possible but separated from RAM; suited to more complex operations.
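The kernel-trick idea above can be illustrated with a toy example (entirely illustrative, not from the lecture): a cluster inside a ring is not linearly separable in 2D, but the warp $(x,y)\mapsto(x,y,x^2+y^2)$ makes it separable by a plane in 3D:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes no straight line in 2D can separate: a cluster inside a ring.
inner = 0.3 * rng.normal(size=(50, 2))
angle = rng.uniform(0, 2 * np.pi, size=50)
ring = np.column_stack([3 * np.cos(angle), 3 * np.sin(angle)]) \
       + 0.1 * rng.normal(size=(50, 2))

def lift(points):
    """Warp 2D -> 3D with the feature map (x, y) -> (x, y, x^2 + y^2)."""
    return np.column_stack([points, (points ** 2).sum(axis=1)])

# In 3D the plane z = 4 separates the two classes.
z_inner = lift(inner)[:, 2]
z_ring = lift(ring)[:, 2]
print(z_inner.max() < 4 < z_ring.min())  # True
```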
### Convolutional neural network
The input is mapped to feature maps (as arrays):
* Several convolutions linearly superpose the input, producing an output with fewer parameters.
* The more kernels (filters), the more features can be seen; a bigger filter scans more cells at a time.
* Each scan position of a filter produces one value.
* After processing with ReLU, the final output values are obtained.
![https://indoml.com/2018/03/07/student-notes-convolutional-neural-networks-cnn-introduction/](https://indoml.files.wordpress.com/2018/03/one-convolution-layer1.png)
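A minimal sketch of the scanning described above, assuming single-channel "valid" convolution (the function `conv2d` is illustrative, not a library API): each filter position produces one value, and the feature map is smaller than the input:

```python
import numpy as np

def conv2d(image, kernel):
    """Single-channel 'valid' convolution (cross-correlation, as in CNNs)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # each filter position produces one value in the feature map
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
edge = np.array([[1.0, -1.0]])               # a 1x2 "edge" filter
feat = np.maximum(conv2d(image, edge), 0.0)  # ReLU after the convolution
print(feat.shape)  # (4, 3): fewer values than the 16 input pixels
```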
### Generative adversarial network
After the input passes through a CNN, the output is computed backwards (inverted).
The problem is the second half of the process, going from low to high degrees of freedom: it is not clear how the transformation should proceed.
### Recurrent neural network
Networks that handle data with a time dimension:
* RNN
* LSTM
* GRU
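A minimal vanilla-RNN step in NumPy (sizes and weights are illustrative): the hidden state carries information along the time dimension:

```python
import numpy as np

rng = np.random.default_rng(2)

# A single vanilla-RNN cell; the hidden state h is passed between time steps.
n_in, n_h = 3, 4
W_xh = 0.1 * rng.normal(size=(n_h, n_in))
W_hh = 0.1 * rng.normal(size=(n_h, n_h))
b_h = np.zeros(n_h)

def rnn_step(h, x):
    """One recurrence: h_new = tanh(W_xh x + W_hh h + b)."""
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

h = np.zeros(n_h)
for x in rng.normal(size=(5, n_in)):  # a length-5 input sequence
    h = rnn_step(h, x)
print(h.shape)  # (4,)
```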
# Group meeting
## Jul 6: Precipitation Nowcasting Based on an Optimised Deep Learning Model Trained with Heterogeneous Weather Data
<!-- 4 main types of precipitation in Taiwan -->
Constraints of NWP:
* Accuracy of IC
Potentially solved by deep learning?
### Model design
* CPN: encoder (convolution+GRU) -> forecaster (deconvolution(?))
* CPN_PONI: predictions of earlier times are used for later predictions
* CPN_PONI_AII: Input of heterogeneous weather data;
*Locally-connected layer* (LCL): the convolution filter changes with position. Cons: more parameters, which is expensive, and the layer can be too local; pros: ?
Model performance defined by the *CSI score*
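For reference, the CSI (critical success index, also called the threat score) has the standard definition $\mathrm{CSI}=\text{hits}/(\text{hits}+\text{misses}+\text{false alarms})$; a quick sketch with illustrative numbers:

```python
def csi(hits, misses, false_alarms):
    """Critical Success Index (threat score)."""
    return hits / (hits + misses + false_alarms)

print(csi(40, 10, 10))  # 0.6666666666666666
```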
### Result
* Success rates generally decrease with intensity and lead time
* Heterogeneous data helps the prediction of low-precipitation events.
* LCL doesn't make a significant difference
* The model can overestimate low-intensity events; LCL may be able to help prevent this (but 2 layers can give odd results)
## Jul 13: Seasonal progress report
### Data assimilation
integrating observations into models
* Using DL models to produce additional input
# Literature Review
## Dynamical Adjustment of the Trade Wind Inversion Layer (Wayne Schubert, 1995)
### Problem formulation
Consider an inviscid, adiabatic, and zonally symmetric atmosphere.
The primitive equations in the horizontal directions are
\begin{align}
\frac{\mathrm{D}u}{\mathrm{D}t} - fv - \frac{uv}{R_\oplus}\tan\phi &= 0 \\
\frac{\mathrm{D}v}{\mathrm{D}t} + fu + \frac{u^2}{R_\oplus}\tan\phi &= -\alpha\frac{\partial p}{\partial y} ,
\end{align}
where
\begin{align}
f &= 2\Omega\sin\phi \\
\frac{\mathrm{D}}{\mathrm{D}t} &= \frac{\partial}{\partial t} + v\frac{\partial}{\partial y}
\end{align}
With the definitions
\begin{align}
\Pi &= c_p\left(\frac{p}{p_0}\right)^\kappa = c_p\frac{T}{\theta} \\
M &= \theta\Pi + gz = c_pT+\Phi \\
\sigma &= -\frac{\partial p}{\partial\theta} ,
\end{align}
the primitive equations in the spherical coordinate become
\begin{align}
\frac{\mathrm{D}u}{\mathrm{D}t} - fv - \frac{uv}{R_\oplus}\tan\phi &= 0 \tag{1} \\
\frac{\mathrm{D}v}{\mathrm{D}t} + fu + \frac{u^2}{R_\oplus}\tan\phi &= -\frac{\partial M}{R_\oplus \partial \phi} . \tag{2}
\end{align}
We also see
\begin{align}
\frac{\partial M}{\partial \theta} = \Pi . \tag{3}
\end{align}
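Equation (3) is the hydrostatic relation in isentropic coordinates. It can be checked directly from the definitions above (a step not spelled out in this summary), assuming hydrostatic balance $\partial p/\partial z=-\rho g$, $\alpha=1/\rho$, and $\kappa=R/c_p$:
\begin{align}
\frac{\partial M}{\partial\theta} &= \Pi + \theta\frac{\partial\Pi}{\partial\theta} + g\frac{\partial z}{\partial\theta} , \\
\theta\frac{\partial\Pi}{\partial\theta} &= \theta\,\frac{\kappa\Pi}{p}\frac{\partial p}{\partial\theta} = \frac{RT}{p}\frac{\partial p}{\partial\theta} = \alpha\frac{\partial p}{\partial\theta} = -g\frac{\partial z}{\partial\theta} ,
\end{align}
so the last two terms cancel and $\partial M/\partial\theta=\Pi$ follows.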
# Communal Lectures
## Jul 26: 王慕道
### Coordinate systems
$(x,y)$ and $(u,v)$, both centred at O, then ${x_0}^2+{y_0}^2={u_0}^2+{v_0}^2$.
***Special relativity** is the new Pythagorean theorem:*
In the $(x,t)$ and $(u,\tau)$ coordinates, with $c\equiv1$, we have ${x_0}^2-{t_0}^2={u_0}^2-{\tau_0}^2$ (*the invariance of the Lorentz norm*).
We see that when $x_0=t_0$, also $u_0=\tau_0$ (the 45° lines form a *light cone* in $(x,y,t)$).
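The invariance of the Lorentz norm can be checked numerically under an explicit boost (a quick illustration with $c\equiv1$; the numbers are arbitrary):

```python
import math

def boost(x, t, v):
    """Lorentz boost with velocity v (units with c = 1): returns (u, tau)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (x - v * t), gamma * (t - v * x)

x0, t0 = 3.0, 5.0
u0, tau0 = boost(x0, t0, 0.6)
# The Lorentz norm x^2 - t^2 is unchanged by the boost; the two values agree:
print(x0**2 - t0**2, u0**2 - tau0**2)
```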
In **general relativity**, binary black hole mergers lose mass to gravitational waves.
As for the angular momentum, a *supertranslation ambiguity* arises (as opposed to the supertranslation invariance of the mass flux), i.e. the angular-momentum flux varies between observers.
One approach is to define an ambiguity-free angular momentum.
The null geodesics (45° light rays):
![](https://hackmd.io/_uploads/SkYw9b0q3.png)
Idealised distant observers are at the end of these geodesics.
We can use a conformal compactification to situate observers back.
Future null infinity carries an infinite-dimensional *Bondi-Metzner-Sachs (BMS) group*.
Importance of angular momentum in general relativity:
* GNSS corrections account for the gravity of the Earth, but current corrections do not include the Earth's flattening and rotation