# 21 - Group Testing - Collection: Easy/fast/cheap - Testing: Difficult/slow/expensive - Imagine a population of people $P_1,...,P_n$ and $k$ people are infected. - We can combine samples from the infected population and test them for the virus but is slow - We also now that we have enough samples for each person and that we can combine multiple people's samples into a single test - In these multi-person test's if any of the samples have the virus, the whole thing will come up positive ## Ex: $k=1$ Let's say $n = 1000000$ and $k=1$ - Make $1000$ groups of $1000$ people each - Test each group (1000 tests) - Exactly one group's test should come up positive call it $G$ - Individually test each sample in $G$ (1000 tests) - Exactly one of those should come up positive: total 2000 tests ## Single pooling Back to original scenario - Partition $P_1,...,P_n$ into $\frac{n}{s}$ groups $G_1,G_2,...,G_{\frac{n}{s}}$ each of size $s$ - Test each group $G_i \rightarrow \frac{n}{s}$ tests - If $G_i$ tests positive the individuals in $G_i \rightarrow S$ tests - $P_1,...,P_n$ is a random permutation of $n$ people - ![](https://i.imgur.com/VGCS3fp.png) - $S = \text{ the set of permutations of } n \text{ people}$ - $Pr(\omega) = \frac{1}{|S|} = \frac{1}{n!}$ Probability of getting and single sample space - Let $X$ be the number of tests performed, what is $E(X)$? - We know that $X$ must be greater than or equal to $\frac{n}{s}$ because we do $\frac{n}{s}$ tests - $X = \frac{n}{s} + ?$ - For each $i \in \{1,...,n\}$, define an indicator random variable $X_i$ - $X_i = 1$ if $P_i$'s sample is tested individually - $X_i = 0$ otherwise - $X = \frac{n}{s} + \sum_{i=0}^n X_i$ We need to distinguish between those are infected and those who aren't. $I = \{i: P_i \text{ is infected} \}$ $N = \{1,...,N\} - I$ (Not infected) $|I| = k$ $|N| = n-k$ Now we can change how we thinking about the sum - $X = \frac{n}{s} + \sum_{i=0}^n X_i + \sum_{i=0}^n X_i$ - The total number of tests is equal to the initial number of tests plus the tests on all the people are infected in the second tests and the tests on people who aren't - But people who are infected will always be tested so: - $X = \frac{n}{s} + \sum_{i=0}^n 1 + \sum_{i=0}^n X_i$ - Need to find out expected number of people who will not be infected but be in a group that has an infected person - When is $X_i = 1$? - Define $A_i = \text{ no one in } P_i \text{'s group is infected}$ - $\bar{A_i} = \text{ someone in } P_i \text{'s group is infected} = X_i = 1$ - $E(X_i) = Pr(X_i = 1) = 1 - Pr(A_i)$ - What is $Pr(A_i)$? - $Pr(A_i) = \frac{|A_i|}{|S|} = \frac{}{n!}$ - Procedure: - ![](https://i.imgur.com/DndZ8FC.png) - $Pr(A_i) = \frac{n \binom{n-s}{k}k! \cdot (n-k-1)!}{n!}$ - ![](https://i.imgur.com/2boScoN.png) - ![](https://i.imgur.com/sbrodmi.png) - ![](https://i.imgur.com/sWuc7RJ.png) - $E(X) = E(\frac{n}{s} + \sum_{i=1}^n X_i) = \frac{n}{s} + \sum_{i=I}^n 1 + \sum_{i=N}E(X_i)$ - $= \frac{n}{s} + k + (n-k)(1- \frac{\binom{n-s}{k}}{\binom{n-1}{k}})$ - Expected number of tests we have to perform. ## Multiple pooling ![](https://i.imgur.com/V0ptxme.png) - For each group $G_{i,j}, test group $G_{i,j}$ $(c \cdot \frac{n}{s})$ tests - For each person $x$: - If everyone one of $x$'s groups (all $c$ of them) tests positive, test $x$ ![](https://i.imgur.com/T6ngadh.png) ![](https://i.imgur.com/SY62jen.png) ![](https://i.imgur.com/sJbFJNs.png) ###### tags: `COMP2804` `Probability`