# Where is probability in all this?
slide: https://hackmd.io/@ccornwell/probability1
---
<h3>Filling in the gaps<sup>1</sup></h3>
- <font size=+2>When modeling data, we are rarely interested in the points already *in* the data. We care instead about the points *not* in the data -- the "yet unseen" data.</font>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<font size=+2><sup>1</sup>This phrase may be a common way to describe the idea discussed here (I've seen it in multiple places), notably used by Jesse Johnson.</font>
----
<h3>Filling in the gaps</h3>

<font size=+2>
<div style="text-align:right;">(from The Cartoon Introduction to Statistics)</div></font>
----
<h3>Filling in the gaps<sup>1</sup></h3>
- <font size=+2>When modeling data, we are rarely interested in the points already *in* the data. We care instead about the points *not* in the data -- the "yet unseen" data.</font>
- <font size=+2>**Population probability distribution**: what we want our model to approximate. We only have the observed data to work with, a *sample* from the population.</font>
- <font size=+2 style="color:#181818;">Statistics $=$ using sample data to draw confident conclusions about the population. We harness probability theory to make this work.</font>
- <font size=+2 style="color:#181818;">Machine learning: we also need probability to improve models (making them robust, with good predictions on unseen data) and to understand what a model's output *means*.</font>
<font size=+2><sup>1</sup>This phrase may be a common way to describe the idea discussed here (I've seen it in multiple places), notably used by Jesse Johnson.</font>
----
<h3>Filling in the gaps<sup>1</sup></h3>
- <font size=+2>When modeling data, we are rarely interested in the points already *in* the data. We care instead about the points *not* in the data -- the "yet unseen" data.</font>
- <font size=+2>**Population probability distribution**: what we want our model to approximate. We only have the observed data to work with, a *sample* from the population.</font>
- <font size=+2>Statistics $=$ using sample data to draw confident conclusions about the population. We harness probability theory to make this work.</font>
- <font size=+2 style="color:#181818;">Machine learning: we also need probability to improve models (making them robust, with good predictions on unseen data) and to understand what a model's output *means*.</font>
<font size=+2><sup>1</sup>This phrase may be a common way to describe the idea discussed here (I've seen it in multiple places), notably used by Jesse Johnson.</font>
----
<h3>Filling in the gaps<sup>1</sup></h3>
- <font size=+2>When modeling data, we are rarely interested in the points already *in* the data. We care instead about the points *not* in the data -- the "yet unseen" data.</font>
- <font size=+2>**Population probability distribution**: what we want our model to approximate. We only have the observed data to work with, a *sample* from the population.</font>
- <font size=+2>Statistics $=$ using sample data to draw confident conclusions about the population. We harness probability theory to make this work.</font>
- <font size=+2>Machine learning: we also need probability to improve models (making them robust, with good predictions on unseen data) and to understand what a model's output *means*.</font>
<font size=+2><sup>1</sup>This phrase may be a common way to describe the idea discussed here (I've seen it in multiple places), notably used by Jesse Johnson.</font>
---
<h3>Basics of Probability</h3>
Something is being "observed"...
- <font size=+2 style="color:#181818;">**Sample space** $\Omega$: the set of all possible outcomes.</font>
- <font size=+2 style="color:#181818;">**Event**: a subset of $\Omega$. It could consist of a single outcome, $\{\omega\}$ for some $\omega\in\Omega$, or of more than one.</font>
- <font size=+2 style="color:#181818;">*For now* (assuming all outcomes are equally likely): ${\bf P}(A) = |A|/|\Omega|$ (the probability of $A$ is the number of outcomes in $A$ divided by the number of all possible outcomes).</font>
----
<h3>Basics of Probability</h3>
Something is being "observed"...
- <font size=+2>**Sample space** $\Omega$: the set of all possible outcomes.</font>
- <font size=+2>**Event**: a subset of $\Omega$. It could consist of a single outcome, $\{\omega\}$ for some $\omega\in\Omega$, or of more than one.</font>
- <font size=+2 style="color:#181818;">*For now* (assuming all outcomes are equally likely): ${\bf P}(A) = |A|/|\Omega|$ (the probability of $A$ is the number of outcomes in $A$ divided by the number of all possible outcomes).</font>
----
<h3>Basics of Probability</h3>
Something is being "observed"...
- <font size=+2>**Sample space** $\Omega$: the set of all possible outcomes.</font>
- <font size=+2>**Event**: a subset of $\Omega$. It could consist of a single outcome, $\{\omega\}$ for some $\omega\in\Omega$, or of more than one.</font>
- <font size=+2>*For now* (assuming all outcomes are equally likely): ${\bf P}(A) = |A|/|\Omega|$ (the probability of $A$ is the number of outcomes in $A$ divided by the number of all possible outcomes).</font>
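
A minimal sketch of this counting definition in Python (the die example is my own illustration, not from the slides), using a fair six-sided die as the sample space:

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

# Event A: "the roll is even" -- a subset of omega.
A = {w for w in omega if w % 2 == 0}

# P(A) = |A| / |omega|, valid because all six outcomes are equally likely.
p_A = Fraction(len(A), len(omega))
print(p_A)  # prints 1/2
```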
---
<h3>Example</h3>
- <font size=+2>Say that we draw one playing card from a standard deck of $52$ cards.</font>
- <font size=+2>Event: "an Ace is drawn". What is the probability?</font>
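
One way to check the answer is to count outcomes directly. A hedged sketch in Python (the deck construction below is my own, not from the slides):

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Sample space Omega: all 52 cards, each equally likely to be drawn.
deck = list(product(ranks, suits))

# Event: the drawn card is an Ace (4 favorable outcomes).
aces = [card for card in deck if card[0] == "A"]

# P(Ace) = |event| / |Omega| = 4/52 = 1/13.
p_ace = Fraction(len(aces), len(deck))
print(p_ace)  # prints 1/13
```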
---
<h3>Example</h3>
- <font size=+2>Say that we draw two playing cards from a standard deck of $52$ cards.</font>
- <font size=+2>Event: "at least one Queen is drawn". What is the probability?</font>
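
Here the sample space is the set of unordered two-card hands, and counting "at least one Queen" directly is easier to verify via the complement (hands with *no* Queen). A sketch in Python, again with my own deck construction:

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))

# Sample space: all unordered 2-card hands, C(52, 2) = 1326 of them.
hands = list(combinations(deck, 2))

# Event: at least one Queen appears in the hand.
with_queen = [h for h in hands if any(card[0] == "Q" for card in h)]
p = Fraction(len(with_queen), len(hands))
print(p)  # prints 33/221

# Complement check: P(at least one Q) = 1 - C(48,2)/C(52,2).
assert p == 1 - Fraction(comb(48, 2), comb(52, 2))
```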
---
<h3>Probability functions: Discussion</h3>