# Data 100 Spring 2020 Discussion 2 Problem 2 Walkthrough ## Golden Rules of finding the probability of an event 1) List all the ways that the event can happen and divide by the total number of possible outcomes 2) List all the ways that the event can **NOT** happen and divide by the total number of possible outcomes; subtract this probability from 1 3) If the event involves multiple trials (e.g. flipping a coin 10 times), imagine yourself doing each trial one at a time and combine the chances of each trial appropriately ## Background on ${n \choose k}$ Some of you may not be familiar with the notation ${n \choose k}$ (pronounced "$n$ choose $k$"), so I thought I'd say a little about what it means and what it's used for. I'll be explaining ${n \choose k}$ through an example: flipping a coin $n$ times. **You have a coin that lands heads with probability $p$. You flip this coin 3 times. What is the probability of getting 2 heads?** We want to find the probability of flipping 2 heads out of 3 flips, which means we want to get 2 heads and 1 tails. Each flip of the coin has probability $p$ of being heads, and therefore probability $(1-p)$ of being tails. Putting this all together and using Golden Rule #3, we end up with the probability of 2 heads as $p^2(1-p)$. But if you actually repeat this experiment many times, you'll see that the actual probability of getting 2 heads out of 3 is higher than $p^2(1-p)$. So what's missing from our answer? We need to account for the fact that there is more than 1 way of getting 2 heads out of 3. You could get HHT, HTH, and THH. Our original answer of $p^2(1-p)$ only accounts for 1 of 3 ways to get 2 heads. Thus, we have to multiply by 3 to get the actual probability of flipping 2 heads: $3p^2(1-p)$. Counting the number of ways to get 2 heads out of 3 is fairly simple to do by hand as we did, but when the numbers get bigger it becomes a much more difficult task. Thus, we need to find some other way of finding the number of ways of getting $k$ heads out of $n$ flips. One way of thinking about counting the number of ways of getting $k$ heads out of $n$ flips is to imagine that there are $n$ blanks and we need to fill in $k$ of them with heads. Going back to our example of flipping a coin 3 times and looking for 2 heads, we can imagine that we have 3 blanks $$_ _ _$$ and we need to choose 2 of these blanks to put an "H" into. We could choose blanks 1 and 2, blanks 1 and 3, and blanks 2 and 3 (corresponding to HHT, HTH, THH). This way of thinking about the problem of counting the number of ways of getting $k$ heads out of $n$ flips as the number of ways of choosing $k$ blanks out of $n$ blanks to put an "H" into lends itself directly to ${n \choose k}$. In fact, this is the reason why ${n \choose k}$ is pronounced "$n$ choose $k$": because it represents the number of ways to choose a set of $k$ items out of $n$ items. Algebraically, ${n \choose k} = \frac{n!}{k!(n-k)!}$. Treat this just as a formula for you to be able to plug numbers into. If we apply the ${n \choose k}$ concept to our original example: we want to find the number of ways to get 2 heads out of 3. We can write this as ${3 \choose 2}$. Plugging into the formula, we get that ${3 \choose 2} = 3$. Notice that this matches up with the 3 that we multiplied by earlier! ## Binomial Probabilities Putting everything we've said together, we can derive a **formula for the number of ways to get $k$ heads out of $n$ flips when the probability of a single flip being heads is $p$**: $${n \choose k}p^k(1-p)^{n-k}$$ The ${n \choose k}$ term accounts for the number of different ways to flip $k$ heads out of $n$, and the $p^k(1-p)^{n-k}$ term deals with the fact that we want $k$ heads and $n-k$ tails. This formula is called a *binomial probability* because it characterizes what statistics calls a binomial setting. ## Discussion 2, Problem 3 This problem asks which expression(s) below correspond to the probability of getting $k$ successes out of $n$ trials with probability $p$ of success. 1) $p^k(1-p)^{n-k}$ 2) ${n \choose k} p^k (1-p)^{n-k}$ 3) ${n \choose n-k} p^k (1-p)^{n-k}$ 4) $\frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}$ From our previous discussion, we know that expressions 2 and 4 must be correct. What might come as a surprise is that expression 3 is also correct. To see why, think about where the ${n \choose k}$ term came from. We wanted to pick $k$ blanks out of $n$ blanks to place an "H" into to correspond to getting $k$ heads out of $n$ flips. But another perfectly valid way of thinking about this problem is to instead consider how we can get $n-k$ tails (this works because getting $k$ heads and getting $n-k$ tails is exactly the same thing since heads and tails are the only options). Applying the same logic we did before, we can reformulate the problem of counting the number of ways of getting $n-k$ tails as the number of ways of choosing $n-k$ blanks to put a "T" into. This quantity is ${n \choose n-k}$, which is why expression 3 is also correct. ## Discussion 2, Problem 4 This problem is useful for the homework. It asks us to find which expression(s) below correspond to the probability of getting *at least 1* success out of $n$ trials when the probability of a success is $p$. 1) $np(1-p)^{n-1}$ 2) $\sum_{k=2}^{n} {n \choose k} p^k (1-p)^{n-k}$ 3) $\sum_{k=1}^{n} {n \choose k} p^k (1-p)^{n-k}$ 4) $1-p^n$ 5) $1 - (1-p)^n$ For this problem, I'll walk through each expression and explain what probability each of them represent. ### Expression 1: $np(1-p)^{n-1}$ This is the probability of getting *exactly 1* success. If you plug $k=1$ into the binomial probability formula, this is the expression you will get. ### Expression 2: $\sum_{k=2}^{n} {n \choose k} p^k (1-p)^{n-k}$ This is the probability of getting *at least 2* successes. Note that getting at least 2 successes means getting 2 successes, or 3 successes, or 4 successes, etc. all the way up to $n$ succeses. We can expand the sum: $$\sum_{k=2}^{n} {n \choose k} p^k (1-p)^{n-k} = {n \choose 2} p^2 (1-p)^{n-2} + {n \choose 3} p^3 (1-p)^{n-3} + ... + {n \choose n} p^n (1-p)^{n-n}$$ Each of the terms on the right side of the equals sign corresponds to the probability of getting 2 successes, 3 successes, etc. all the way to $n$ successes. ### Expression 3: $\sum_{k=1}^{n} {n \choose k} p^k (1-p)^{n-k}$ This is the probability of getting *at least 1 success*. You can use the same logic as the previous expression to see why. ### Expression 4: $1-p^n$ This is the probability of getting *at least 1 failure*. To see why, it's easier to think in terms of actual events rather than probabilities. Let's imagine (again) that we have a coin that lands heads with probability $p$ and we're going to flip that coin $n$ times. Using Golden Rule #3, we can see that $p^n$ is the probability of getting all heads. Using Golden Rule #2, we can see that $1 - p^n$ is the probability of **NOT** getting all heads. Another way of saying not getting all heads is getting at least one tails. ### Expression 5: $1 - (1-p)^n$ This is the probability of getting *at least 1 success*. To see why, we can use similar logic as the previous expression. Using Golden Rule #3, we can see that $(1-p)^n$ is the probability of getting all tails. Using Golden Rule #2, we can see that $1 - (1-p)^n$ is the probability of **NOT** getting all tails. Another way of saying not getting all tails is getting *at least 1 heads*. ## Conclusion Hopefully this walkthrough clarified any of your doubts. If you have any questions, please feel free to email me at rkunani@berkeley.edu, come to OH, or ask on Piazza!