# perturb [TOC] dpdata will disturb the box size and shape and change atom coordinates randomly. ```python3 dpdata.System('./POSCAR').perturb(pert_num=3, box_pert_fraction=0.02, atom_pert_fraction=0.01, atom_pert_style='normal') ``` ### `pert_num` Each frame in the input system will generate `pert_num` frames. That means the command will return a system containing `frames of input system * pert_num` frames. ### `box_pert_fraction` A relative length that determines the length the box size change in each frame. It is just a fraction and doesn't have unit. Typlicaly, for cubic box with side length `a` , `a` will increase or decrease a random value from the intervel `[-a*box_pert_fraction, a*box_pert_fraction]`. It will also change the shape of the box. That means an orthogonal box will become a non-orthogonal box after perturbing. The angle of inclination of the box is a random variable. `box_pert_fraction` is also relating to the probability distribution function of the angle. See more details about how it will change the box below. ### `atom_pert_fraction` A relative length that determines the length atom moves in each frame. It is just a fraction and doesn't have unit. Typlicaly, for a cubic box with side length `a` , the mean value of the distance that atom moves is approximate `a*atom_pert_fraction`. ### `atom_pert_style` The probability distribution function used to change atom coordinates. available options:`'uniform' 'normal' 'const'`. `uniform` means that if the box is a cube with side length `a`, how far atoms in the cube move is a random vector with max length `a*atom_pert_fraction`. `normal` means the squares of the distance atoms move are subject to a chi-square distribution with 3 degrees of freedom (chi-square distribution can be seen as the sum of squares of normal distributed random variable. This is why we name this option as 'normal'.). If the box is a cube with side length `a`, the mean value of the distance atom moves is `a*atom_pert_fraction`. `const` means that if the box is a cube with side length `a`, the distance atom moves is always `a*atom_pert_fraction `(For triclinic box, the distances are not equal.) The direction atoms move and box deformation is random. See more details about how atoms will move below. ## The perturb details --- For each frame in the input system,dpdata will repeat the following steps `pert_num` times. That means the command will return a system containing `frames of input system * pert_num` frames. ### first step: box deform, and atom moves correspondingly #### 1. generate a perturb matrix for box and atoms --- dpdata will generate a perturb matrix $L_1=\begin{pmatrix} n_1+1 & n_2/2 & n_4/2 \\ n_2/2 & n_3+1 & n_5/2 \\ n_4/2 & n_5/2 & n_6+1 \end{pmatrix}$ $n_i,i\in\{1,2,3,4,5,6\}$ are independent variables which are subject to uniform distrubution on interval $[-box\_pert\_fraction,box\_pert\_fraction]$. That is $n_i \sim U(-box\_pert\_fraction,box\_pert\_fraction)$ #### 2.box deforms --- The origin box matrix of this frame is defined as $B_o$, usually with lower lower triangular matrix form. That is: $origin\_box\_matrix \equiv B_o=\begin{pmatrix} xx & 0 & 0 \\ xy & yy & 0\\ xz & yz & zz \end{pmatrix}$ The box matrix after deformation will be $deformed\_box\_matrix \equiv B_d = B_o L_1$ #### 3.atoms relocate in the new box. --- The atom coordinates in this frame will change correspondingly. That is, for atom with index $j, j \in\{0,1,2,3,...,atom\_num-1\}$ in the frame with coordinate $\vec r_j=(x_j, y_j, z_j)$, the coordinate after deformation $\vec r_{jd}$ will become the matrix multiplication of $r_j$ and $L_1$. That is: $\vec r_{jd} = \vec r_j L_1$ ### second step: perturb atoms randomly. #### 1. generate a random vector $\vec l_{j}$ for each atom --- For each atom with index $j, j \in\{0,1,2,3,...,atom\_num-1\}$ in the frame with coordinate $\vec r_{jd}=(x_{jd}, y_{jd}, z_{jd})$, dpdata will generate a random vector $\vec l_{j}=(l_{jx}, l_{jy}, l_{jz})$ . The method used to generate $\vec l_j$ is described below. :tada: if pert_style is 'normal': $l_{jx},l_{jy},l_{jz}$ are independent random variables which are subject to normal distribution with mean $\mu=0$ and variance $\sigma^2 = atom\_pert\_fraction^2/3$ . That is: $l_{jx},l_{jy},l_{jz} \sim N(0,atom\_pert\_fraction^2/3)$ $\Vert \vec l_{j} \Vert_2$ is the distance the atom moves, and it satisfies the following equation $\Vert \vec l_{j} \Vert_2^2=l_{jx}^2 + l_{jy}^2+l_{jz}^2$. $3\Vert \vec l_{j} \Vert_2^2/atom\_pert\_fraction^2$ will be object to the chi-square distribution with 3 degrees of freedom. That is: $\frac{3\Vert \vec l_{j} \Vert_2^2}{atom\_pert\_fraction^2} \sim \chi(3)$ The expectation value of $\Vert \vec l_{j} \Vert_2$ will be `atom_pert_fraction` .That is: $E(\Vert \vec l_{j} \Vert_2)=atom\_pert\_fraction$ :tada: if pert_style is 'uniform': $\vec l_{j}$ will be a vector point to a random point inside a 3D unit sphere and its internal space. That is: $\{ \vec l_{j} \in \mathbb{R^3} | \Vert \vec l_{j} \Vert_2 \leq atom\_pert\_fraction \}$. The point is chosen with equal probability. That means for arbitrary two subsets of the 3D sphere and its internal space $\{(x,y,z)|x,y,z\in\mathbb{R},x^2+y^2+z^2\leq atom\_pert\_fraction^2\}$, if the 'volume' of the subsets are equal, the probability that the random point is in them are equal. The following method is used to generate such $\vec l_{j}$. random direction of equal probability >let $\vec x$ become a 3 dimension standard normal distribution. That is $\vec x=(x_1,x_2,x_3), x_i(i\in\{1,2,3\})\ are\ independent.$ $x_i \sim N(0,1)$ and $\vec l_{j,unit\_surface}=\vec x /\Vert \vec x \Vert_2$ Now $\vec l_{j,unit\_surface}$ point to a point located at the surface of the 3D unit sphere. random point of equal probability > let $u$ become a random variable which is object to the uniform on interval [0,1]. That is: $u \sim U(0,1)$ and define $v$ as the 3th root of $u$ (because it is 3 dimension), That is: $v\equiv u^{1/3}$ Then the target vector $\vec l_{j}$ is $\vec l_{j}=atom\_pert\_fraction \cdot v \cdot \vec l_{j,unit\_surface}$ :tada: if pert_style is 'const': $\vec l_j \equiv atom\_pert\_fraction \cdot \vec l_{j,unit\_surface}$ The definition of $\vec l_{j,unit\_surface}$ is described in detail above. and it is obvious that $\Vert \vec l_{j} \Vert_2=atom\_pert\_fraction$ #### 2. atom moves according to $\vec l_{j}$ --- After the $\vec l_j$ generated, the atom coordinates disturbed $\vec {r_{jd,disturbed}}$ will become $\vec {r_{jd,disturbed}}=\vec r_{jd}+\vec l_{j} B_d$ $B_d$ is the box matrix after deformatiion, described in first step $\vec r_{jd}$ is the coordinate of atom j after deformatiion, described in first step. ## the results the method System.perturb() return --- At the beginning, dpdata will create an empty system. ```python perturbed_system = System() ``` After every single perturbed frame is generated, it will be appended to the `perturbed_system`. perturbed_system will contain `pert_num * frames of the input system` frames. $B_d$ will be used as the final box matrix in the perturbed frame(that is `System.data['cells'][0]`) $\vec {r_{jd,disturbed}}$ will be used as the final coordinates in the perturbed frame (that is `System.data['coords'][0][j]`) > note: 0 means the index of the frame, j means atom index, Finally dpdata will return `perturbed_system` as results.