# Kaoutar's project
###### tags: `bachelors projects`
---
## To do
- Triangle tiling with missing particles
- Method optimization
- Quasicrystals
## To plan
- Midterm presentation with Christian
## Paper: Configurational Entropy of Network-Forming Materials (Vink and Barkema)
Entropy in information theory: Shannon entropy
$$H(n)=-\sum_i p(i)\log_2 p(i)$$
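As a quick sanity check (the standard coin example): a fair coin with $p(\text{heads})=p(\text{tails})=\tfrac{1}{2}$ gives
$$H=-\left(\tfrac{1}{2}\log_2\tfrac{1}{2}+\tfrac{1}{2}\log_2\tfrac{1}{2}\right)=1\ \text{bit},$$
while a deterministic outcome ($p=1$) gives $H=0$.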
### Explanation overcounting terms papers (Rinske)
Assume we have a network where we form graphs by selecting random points in space and choosing the nearest $n$ particles. The first question we want to answer is how large on average the volume $V_m$ is within which all selected random points would lead to the same graph.
To find $V_m$, we first estimate how the average distance $\Delta r_n$ between the $n^\text{th}$ and the $(n+1)^\text{th}$ neighbour depends on $n$. This distance is roughly proportional to the radius of the volume $V_m$ (points selected within this distance would typically lead to the same graph).
The volume $V_n$, which contains the $n$ nearest particles, is on average given by
$$
V_n = \frac{n}{\rho}\propto n.
$$
Consequently the radius of $V_n$ scales as
$$
r_n \propto n^{1/d}.
$$
To find the dependence of $\Delta r_n$ on $n$ in the large $n$ limit, we can write
$$
\Delta r_n = \frac{\Delta r_n}{\Delta n}\Delta n \approx \frac{\partial r_n}{\partial n} \propto \frac{\partial n^{1/d}}{\partial n} \propto n^{1/d-1},
$$
since $\Delta n=1$. Consequently
$$V_m\propto (n^{1/d-1})^d=n^{1-d}.$$
The size of $V_m$ determines how many points $M$ we can select before overcounting graphs. Specifically, we want to limit the number of measurements we do on a single snapshot, to ensure that we do not keep finding the exact same group of $n$ particles over and over. To do that, we would like to make sure that the probability of sampling the same volume $V_m$ multiple times remains low. Hence, $M$ scales like
$$M\propto V_m^{-1} =(n^{1-d})^{-1} = n^{d-1}.$$
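For example, in $d=2$ (our triangle lattices) this gives
$$V_m\propto n^{-1},\qquad M\propto n,$$
so the number of independent measurements we can take per snapshot grows linearly in the subgraph size.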
The paper by Vink and Barkema empirically finds a reasonable prefactor for this scaling as well.
Next, we consider the effects of picking a random point in space on the overcounting of graphs.
If $V_m$ is small, many graphs that are actually similar will be classified as being different due to small fluctuations at the boundary (i.e. one or a few particles at the edge of the graph are different, but the core is the same). This would even happen in a simple crystal, where the entropy we are looking for is clearly zero.
Thinking of a simple crystal with one particle in the unit cell, we could say that if we pick our point anywhere in the unit cell, we expect to end up with the same environment, apart from these surface effects. But the number of actually different graphs we end up with inside the volume of one unit cell is given by $V_u / V_m$ where $V_u$ is the unit cell volume. So we are overcounting by a factor proportional to $1/V_m \propto n^{d-1}$.
Note that a similar argument holds for any environment, not just crystals. For example, one could argue that any point chosen within the Voronoi cell of the same particle should lead to a graph that has the same "core".
Now we want to know the influence of this overcounting on the entropy. Instead of observing one type of "core" graph with probability $p_i$, this graph gets cut into a large number $f \propto n^{d-1}$ of variations of this graph, each with slightly different combinations of particles at the surface. The probability of observing each of these variations is therefore reduced to $p_i/f$, where we assume that in the limit of large $n$, the probability of each variation becomes roughly equal.
So instead of the target $H(n)$, we measure:
$$
\begin{align*}
\tilde{H}(n)&=-\sum_i f \frac{p_i}{f}\log\frac{p_i}{f}\\
&=-\sum_i p_i \log p_i + \sum_i p_i \log f\\
&= H(n) + \log f
\end{align*}
$$
We see that to recover the target entropy we have to correct the measured entropy $\tilde{H}(n)$ by a term $$g(n) = - \log f = c - (d-1)\log(n),$$
where $c$ is a constant and we have used that $f$ is proportional to $n^{d-1}$ (and, in the step above, that $\sum_i p_i = 1$). Note that the constant is irrelevant for our calculations, since we only care about the slope of $H(n)$, not its absolute value.
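A sketch of how this correction could be applied in practice for $d=2$ (here `htilde` is a hypothetical list of measured $\{n, \tilde{H}(n)\}$ pairs, e.g. from the sampling code later in these notes; entropies in nats, so natural `Log`):

```mathematica
d = 2;
(* H(n) = Htilde(n) + g(n), with the irrelevant constant c dropped *)
h = {#[[1]], #[[2]] - (d - 1) Log[#[[1]]]} & /@ htilde;
(* slope estimate s from the last two points of the corrected curve *)
s = (h[[-1, 2]] - h[[-2, 2]])/(h[[-1, 1]] - h[[-2, 1]])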
## Presentation 11 Sep
## Perfect triangle tiling in Mathematica
First I tried using *Graphics* triangles:

However, a graph is easier to work with. First try was not so great, but I got there:


### Useful functions:
*NearestNeighborGraph[points]*: gives a graph in which all nearest neighbours are connected by an edge.
*EdgeList[graph]*: gives a list of all edges in a graph.
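A minimal sketch of how these two fit together, on a four-point patch:

```mathematica
pts = {{0, 0}, {1, 0}, {1/2, Sqrt[3]/2}, {3/2, Sqrt[3]/2}};
g = NearestNeighborGraph[pts];  (* connects each point to its nearest neighbour(s) *)
EdgeList[g]                     (* all edges of the resulting graph *)
```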
## Method building
- Unit cell vs. inner + outer border
- Not saving all graphs, to increase speed
## Concepts and definitions
- Configurational entropy (entropy from an information-theory and physics view)
- Starting with perfect lattice
- Voronoi cell
- Then: imperfections
- No more Voronoi cell -> Boundary conditions (what happens at the edges?)
- Symbols: S, n, m, N
## Text
$S(n)=-\sum_i p_i\ln(p_i)$, $p_i\approx \frac{f_i}{m}$, $s = \lim_{n\to\infty}[S(n+1)-S(n)]$.
For small $n$, $S(n)$ still depends strongly on $n$ because every extra node reveals new local variations in the structure: each increment in $n$ gives access to new structural configurations. For larger $n$, the sampled subgraphs start to reflect the overall structure and the probability distribution stabilizes: the probabilities $p_i$ of finding specific subgraphs converge to the values we would find if we sampled the entire structure.
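As a one-line sketch of this estimator (the argument `freqs` is assumed to hold the counts $f_i$ of each unique subgraph among the $m$ samples):

```mathematica
(* S = -Sum p_i Log[p_i], with p_i estimated as f_i/m *)
entropy[freqs_] := With[{p = N[freqs/Total[freqs]]}, -Total[p Log[p]]]

entropy[{3, 1}]  (* two subgraph types, seen 3 and 1 times: about 0.562 nats *)
```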
### Perfect triangle lattice (2 methods)
We start with the method described by Vink and Barkema (2002) for a trivial case: a perfect lattice, with all nearest neighbours at the same distance from each other, for which we expect a configurational entropy of 0.
#### Building the lattice part 1. <font color=#f00>(should this go in results or methods?)</font>
The first task is to build a graph that represents our lattice structure. To do so, we start by generating all lattice points of a triangle lattice, placing them at regular intervals defined by the lattice constant $a=1$. Once the positions of the graph nodes are established, we construct the graph edges by connecting each node to all nodes separated by the lattice constant.
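A minimal sketch of this construction (the patch size `L` and the variable names are my own choices, not necessarily those of the original notebook):

```mathematica
a = 1;   (* lattice constant *)
L = 20;  (* number of sites per row/column of the patch *)

(* triangle-lattice points: rows shifted by a/2 and spaced a Sqrt[3]/2 apart *)
pts = Flatten[Table[{a (i + j/2), a j Sqrt[3]/2}, {i, 0, L - 1}, {j, 0, L - 1}], 1];

(* connect every pair of nodes separated by exactly the lattice constant *)
edges = Select[Subsets[Range[Length[pts]], {2}],
   EuclideanDistance @@ pts[[#]] == a &];

lattice = Graph[Range[Length[pts]], UndirectedEdge @@@ edges,
   VertexCoordinates -> pts];
```

For this perfect patch, *NearestNeighborGraph[pts]* should give the same edges, since every node's nearest neighbours sit exactly at distance $a$.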
#### Vink and Barkema's method (random point)
For such a perfect lattice, it was sufficient to look at a single Voronoi cell, since all Voronoi cells and their surroundings are identical. <font color=#f00>(see fig ...)</font>

Within this cell, random points were chosen, around which the nearest $n$ nodes were taken to form a subgraph. We made the lattice sufficiently large that any extracted subgraph with at most $n$ nodes would lie completely within the borders of this lattice. To estimate $S(n)$, we sample $m$ of these subgraphs and count how often each unique subgraph appears. To determine whether two subgraphs (and thus, two areas) are similar, we check whether they are isomorphic. After filtering out the contribution of the subgraph border to the entropy <font color=#f00>(ref to explanation?)</font>, the results show that $S(n)$ increases for small $n$, but flattens out for larger $n$ <font color=#f00>(see fig...)</font>. The slope of $S(n)$ approaches zero, confirming we are measuring the configurational entropy, which is zero for a perfect triangle lattice structure.
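A sketch of this sampling loop, reusing the hypothetical `pts` and `lattice` from the construction sketch above; the sampling window is an arbitrary cell near the centre of the patch, chosen only for illustration:

```mathematica
sampleS[n_, m_] := Module[{nf, subs, counts, p},
  nf = Nearest[pts -> "Index"];  (* fast lookup of the node indices nearest to a point *)
  subs = Table[
    (* subgraph on the n nodes nearest to a random point in an interior cell *)
    Subgraph[lattice, nf[{9, 8} + RandomReal[{0, 1}, 2], n]],
    {m}];
  counts = Last /@ Tally[subs, IsomorphicGraphQ];  (* group isomorphic subgraphs *)
  p = N[counts/m];
  -Total[p Log[p]]]
```

For large $m$, the pairwise *IsomorphicGraphQ* test becomes the bottleneck; tallying canonical forms (*CanonicalGraph*) instead should be considerably faster.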
When sampling random points in space, we should consider that nearby points might give the same subgraph, which would lead to overcounting (and affect our measured entropy). **(still to be worked out!)**
#### Random nodes
Since there seemed to be no clear reason to choose random points in our 2D space rather than random nodes, we tried the latter approach as well. This approach introduced a new problem: at the borders of the subgraphs, there are usually several nodes at equal distance from the center node. We tried choosing randomly among these equidistant nodes, which resulted in a different contribution of the subgraph borders to the entropy than we found before. To illustrate this, consider the scenario where $n=9$. The seven inner nodes always form the same core, but there are six nodes at the border, from which we must add two to that core to end up with a subgraph of size $n=9$. When choosing these last two nodes randomly, we find new subgraphs that we had not seen with our first method <font color=#f00>(see fig...)</font>. <font color=#f00>(How do I proceed from here?)</font>
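A sketch of this node-based sampling with random tie-breaking (again assuming the `pts` and `lattice` from before; squared distances are used so the tie comparisons stay exact):

```mathematica
randomNodeSubgraph[n_] := Module[{c, d, rn, core, shell},
  c = RandomChoice[Range[Length[pts]]];                (* random center node;
      for a real run, restrict c to nodes far from the patch edge *)
  d = SquaredEuclideanDistance[pts[[c]], #] & /@ pts;  (* exact squared distances *)
  rn = Sort[d][[n]];                                   (* squared distance of the n-th nearest *)
  core = Select[Range[Length[pts]], d[[#]] < rn &];    (* nodes strictly inside the border *)
  shell = Select[Range[Length[pts]], d[[#]] == rn &];  (* tied nodes at the border *)
  Subgraph[lattice, Join[core, RandomSample[shell, n - Length[core]]]]]
```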
### Imperfect triangle lattice
Next, we tried the method described by Vink and Barkema for an imperfect triangle lattice structure. In this structure, we have $N$ lattice sites, of which $xN$ are empty (no particle) and $(1-x)N$ are non-empty. We can no longer look at one Voronoi cell and sample within that cell, since the Voronoi cells are not equal. To be able to sample anywhere in the structure, we implement periodic boundaries.
#### Building the lattice part 2. <font color=#f00>(should this go in results or methods?)</font>
To create the periodic boundaries, we must define a periodic distance function, which accounts for the "wrapping" of the lattice. Nodes on one edge of the lattice can now recognize the nodes on the opposite edge, and are once again connected when their distance is equal to the lattice constant. To check whether this was successful, we check whether every node has six edges.
Once we have implemented the periodic boundaries, the only thing left to do is to make sure that $xN$ lattice sites are empty. We randomly remove that number of nodes and all edges connected to them, to acquire a graph that represents a periodic triangle lattice structure with empty lattice sites.
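A sketch of this construction, following the periodic-distance idea described above (names and the small patch size are mine; the all-pairs neighbour search is $O(N^2)$, so the patch is kept small):

```mathematica
a = 1; L = 10; x = 0.1;  (* lattice constant, sites per side, fraction of vacancies *)

pts = Flatten[Table[{a (i + j/2), a j Sqrt[3]/2}, {i, 0, L - 1}, {j, 0, L - 1}], 1];

(* the two lattice vectors spanning the periodic box *)
t1 = a L {1, 0}; t2 = a L {1/2, Sqrt[3]/2};

(* periodic distance: minimum over the nine surrounding images *)
pDist[p_, q_] := Min[Flatten[
    Table[EuclideanDistance[p, q + i t1 + j t2], {i, -1, 1}, {j, -1, 1}]]];

edges = Select[Subsets[Range[Length[pts]], {2}], pDist @@ pts[[#]] == a &];
periodic = Graph[Range[Length[pts]], UndirectedEdge @@@ edges];

DeleteDuplicates[VertexDegree[periodic]]  (* {6}: every node has six edges *)

(* empty x N random lattice sites; VertexDelete also removes their edges *)
imperfect = VertexDelete[periodic, RandomSample[Range[Length[pts]], Round[x L^2]]];
```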
#### Expectations
In this case (using $S(n)=-k_B n[x\ln x+(1-x)\ln(1-x)]$) <font color=#f00>(what do I do with the fact that $n\neq N$?)</font> we expect the entropy density to be
\begin{align}
s&=\lim_{n\to\infty} [S(n+1)-S(n)]\\
&= \lim_{n\to\infty} -k_B[x\ln x+(1-x)\ln(1-x)]\\
&= -k_B[x\ln x+(1-x)\ln(1-x)].
\end{align}
The maximum entropy we should find for large $n$ is $S_{\text{max}}=-\sum_{i=1}^N\frac{1}{N}\ln\frac{1}{N}=\ln N$.
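As a quick numerical check (simple arithmetic, not a simulation result): for $x=\tfrac{1}{2}$,
$$s=-k_B\left[\tfrac{1}{2}\ln\tfrac{1}{2}+\tfrac{1}{2}\ln\tfrac{1}{2}\right]=k_B\ln 2\approx 0.693\,k_B,$$
the maximum of the mixing entropy per site.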
### Temp
$$Z(N,V,T) = \frac{1}{h^{3N}N!}\int d \mathbf{r}^N d\mathbf{p}^N \exp(-\beta (U_{pot} + U_{kin})) $$
$$Z(N,V,T) = \frac{1}{\Lambda^{3N}N!}\int d \mathbf{r}^N \exp(-\beta (U_{pot})) $$
$$Z(N,V,T) = \sum_{k \in \mathrm{configs}} \frac{1}{\Lambda^{3N}N!}\int_k d \mathbf{r}^N \exp(-\beta U_{pot}) $$
$$Z(N,V,T) = \sum_{k \in \mathrm{configs}} \exp(-\beta F^\mathrm{vib}_k) $$
$$Z(N,V,T) \simeq N_\mathrm{configs} \exp(-\beta F^\mathrm{vib}) $$
OR
$$Z(N,V,T) = N_\mathrm{configs} \langle \exp(-\beta F^\mathrm{vib})\rangle $$
Then:
$$F(N,V,T) = -k_B T \log Z = F^\mathrm{vib} - k_B T \log N_\mathrm{configs} = F^\mathrm{vib} - T S_\mathrm{conf},$$
so we can identify $S_\mathrm{conf} = k_B \log N_\mathrm{configs}$.