Piyush Ranjan
---
title: Content
description:
duration: 5400
card_type: cue_card
---

### <font color='blue'>**Content**</font>

- Log-Normal Distribution
- Poisson Distribution

---
title: Log Normal distribution, Key Characteristics of Log Normal Distribution
description:
duration: 5400
card_type: cue_card
---

### <font color='blue'>**Introduction**</font> (2 mins)

Greetings everyone,

In our ongoing journey through probability distributions, we will look at two more distributions: the Log-Normal distribution and the Poisson distribution.

Each of them plays an important role in various fields. Today, we'll explore their characteristics, applications, and real-world significance.

## <font color='blue'>**Log Normal Distribution**</font> (10-12 mins)

```
Imagine that you are a data scientist at Amazon/Swiggy/Zomato.
You've collected a bunch of data on delivery times.
```

How much time does a delivery generally take?
- Let's assume around 30 mins, sometimes less and sometimes more.

Now, if we take thousands of these delivery time data points and plot a histogram,
- It may be skewed to the right. Most deliveries take around 30 minutes, but a few take much longer, which creates a long right tail.

<font color='purple'>**The lognormal distribution is a continuous probability distribution that models this type of right-skewed data.**</font>

<br>

<font color='purple'>Suppose $X$ is the actual data</font>
- The beauty of the log-normal is that when you take <font color='purple'>**the logarithm (log) of the actual delivery time data**</font> and plot a new histogram,
- The new <font color='purple'>**histogram tends to be more symmetrical**</font>, like a bell curve.

In simple terms, <font color='purple'>if the original data is log-normal, taking its logarithm produces data that looks like a normal, symmetrical distribution</font>.
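Before turning to real data, the idea can be tried on synthetic data. This is a minimal sketch, not part of the original lecture: the parameters `mu_log` and `sigma_log` are assumed values, chosen only so that the typical delivery time is near 30 minutes. It compares the skewness of the simulated delivery times before and after taking the log.

```python
import numpy as np
from scipy.stats import skew

# Assumed, illustrative parameters of log(delivery time); not taken from any real dataset
mu_log, sigma_log = 3.4, 0.4

rng = np.random.default_rng(seed=42)
delivery_times = rng.lognormal(mean=mu_log, sigma=sigma_log, size=100_000)

raw_skew = skew(delivery_times)          # positive: long right tail
log_skew = skew(np.log(delivery_times))  # near zero: roughly symmetric

print(f"median delivery time: {np.median(delivery_times):.1f} mins")
print(f"skewness before log : {raw_skew:.2f}")
print(f"skewness after log  : {log_skew:.2f}")
```

A skewness well above 0 indicates a long right tail, while a value close to 0 indicates a roughly symmetric shape, which is what we expect after the log.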
<br>

So, in the language of distributions, we say the <font color='purple'>"original delivery time data ($X$) is log-normal."</font>
- It means that if $X$ follows a log-normal distribution, $\log(X)$ follows a normal (bell-shaped) distribution.

<br>

Conversely, you can **exponentiate a normally distributed variable to obtain a log-normal one**: if $Y$ is normal, then $\exp(Y)$ is log-normal.

In this manner, you can transform back and forth between a log-normal distribution and its related normal distribution.

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/458/original/pd1.png?1700493114 height = 250 width = 300 >

We can see in this image that the original data follows a log-normal distribution, and if we take the log of this distribution, it looks more symmetrical, like a bell-shaped curve.

We will implement this on a real-life dataset.

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/459/original/pd2.png?1700493273 height = 300 width = 400 >

### <font color='purple'>**Real Life dataset**</font>

Let's have a look at a dataset of waiting time records.

Code:
```python=
!wget --no-check-certificate https://drive.google.com/uc?id=1SIZC1FZvZAhVzRvnZ7IFWBUDavvzIafJ -O waiting_time.csv
```
>Output:
```
waiting_time.csv    100%[===================>]   1.58M  --.-KB/s    in 0.01s
2024-01-18 10:09:25 (134 MB/s) - ‘waiting_time.csv’ saved [1656272/1656272]
```

> <font color='purple'>Importing Libraries</font>

Code:
```python=
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import poisson, binom
```

Code:
```python=
data = pd.read_csv("/content/waiting_time.csv")
data.head()
```
>Output:

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/461/original/pd3.png?1700493434 height = 200 width = 130 >

Let's plot this data.

Code:
```python=
sns.histplot(data,bins=100)
```
>Output:

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/462/original/pd4.png?1700493536 height = 350 width = 500 >

<font color='purple'>**Observation**</font>

We can observe that it is right-skewed.

Now,
> <font color='purple'>Q. How can we answer questions related to data which is distributed in this way?</font>

We can transform this data using a **log** and look at the distribution of the transformed data.

### <font color='purple'>**Log Normal Distribution Parameters**</font>

As we know, the random variable for the original data is $X$, and after transforming it using log, the random variable of the transformed data is $\log(X)$.

If **$μ$ and $σ$** (mean and standard deviation) of $\log(X)$ are given and we want to find the mean and variance of the original data $X$ (the log-normal distribution), then they are given by:

- **Mean of original $X$** = ${\displaystyle \exp \left(\mu +{\frac {\sigma ^{2}}{2}}\right)}$

<img src='https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/522/original/Screenshot_2023-11-21_at_6.17.01_PM.png?1700571046'>

- **Variance of original $X$** = ${\displaystyle [\exp(\sigma ^{2})-1]\exp(2\mu +\sigma ^{2})}$

<img src ='https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/523/original/Screenshot_2023-11-21_at_6.18.07_PM.png?1700571088' width=300>

We don't need to remember these formulas, as we have functions in Python to carry out our analysis.

<font color='purple'>Let's transform our original data:</font>

Code:
```python=
data_log = np.log(data)
data_log
```
>Output:

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/464/original/pd6.png?1700493830 height = 400 width = 160 >

Code:
```python=
sns.histplot(data_log, bins=100)
```
>Output:

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/465/original/pd7.png?1700493949 height = 400 width = 550 >

<font color='purple'>**Observation**</font>
- We can observe that after applying the logarithm to the right-skewed data, we get a distribution which is approximately normal.
- We have converted our data into a form where we can use the properties of the Gaussian distribution.

This is known as the <font color='purple'>**log normal transformation**</font>.

<br>

> <font color='purple'>Q. Why did we specifically choose log?</font>

1. <font color='purple'>**Symmetry:**</font>
    - Our original data is right-skewed, with a long tail on the right side indicating occasional very long delivery times.
    - The **logarithmic transformation** compresses larger values more than smaller values.
    - The extreme right tail is pulled in, making the distribution more symmetric.

2. <font color='purple'>**Stabilizing Variance:**</font>
    - In the original delivery time data, you might observe that the variance (spread) of delivery times increases as the mean delivery time increases.
    - Taking the logarithm can **stabilize the variance**: after the transformation, the spread of delivery times is more consistent across different levels of the mean.

In summary, applying a logarithmic transformation to right-skewed data can make the distribution more symmetric and stabilize the variance, making it potentially more useful for certain statistical analyses.

<font color='red'>**Instructor's Note:**</font> (you can use these examples to explain the above points if required)

Let's understand this with the help of an example:

Suppose we have values like,
```
X: 1, 10, 100, 1000, 10000
```
Now let's take the natural log of all these values:
```
- ln(1) = 0
- ln(10) = 2.30
- ln(100) = 4.61
- ln(1000) = 6.91
- ln(10000) = 9.21
```
**Observation**:
- We can clearly observe that taking the log compresses larger values more than smaller values.
- 10,000 got transformed into 9.21, which shows how strongly the large values are compressed.
- This can bring symmetry to the distribution.

**Example on stabilizing variance:**

We can also observe that after applying the log, the variance gets stabilized.
- Let's consider a simple example to illustrate this:

Suppose you have a set of positive numbers whose spread keeps increasing:

Original Data: $1,2,4,8,16,32,…$
- If you observe the differences between consecutive values, you'll see that the differences increase:

Differences: $1,2,4,8,16,…$

Now, if you take the logarithm of the original data:
- Log-Transformed Data: $\ln(1),\ln(2),\ln(4),\ln(8),\ln(16),…$

The differences between the log-transformed values are now constant at $\ln(2) \approx 0.693$

Differences: $\ln(2)−\ln(1),\ln(4)−\ln(2),\ln(8)−\ln(4),\ln(16)−\ln(8),…$

This constant difference suggests a stabilized spread, which can be beneficial in statistical analyses.

## <font color='blue'>**Key Characteristics of a Log-Normal Distribution**</font> (3-5 mins)

Let's understand the key characteristics of a log-normal distribution.

1. **Positivity:**
    - All values in a log-normal distribution are <font color='purple'>positive, because each value is the exponential of a normally distributed value and the exponential of any real number is positive</font>.

2. **Skewness:**
    - A log-normal distribution is right-skewed. Taking the log of log-normal data makes it symmetric and bell-shaped.

3. **Multiplicative Processes:**
    - Log-normal distributions are suitable for modelling scenarios where the final outcome is influenced by the product of independent factors.
    - In our dataset, we are aware that <font color='purple'>delivery times may get affected by various independent factors like traffic, order processing time</font>, etc.

<font color='red'>***Instructor note:***</font> (if you want to explain the above points using examples, you can refer to these)

Example for Positivity:

> Suppose you have a set of numbers following a normal distribution with a mean (μ) of 0 and a standard deviation (σ) of 1. The log-normal distribution is obtained by exponentiating these normal distribution values.

**Normal Distribution:**

Let's generate some random values from a normal distribution:
- Random Values from Normal Distribution: $-1.5, 0.8, -0.3, 1.2, -0.7$.
**Exponentiate to Obtain Log-Normal Distribution:**

Now, exponentiate each of these values:
- Log-Normal Distribution Values: $e^{-1.5}, e^{0.8}, e^{-0.3}, e^{1.2}, e^{-0.7}$

Calculating these values:
- Log-Normal Distribution Values: $0.223, 2.226, 0.741, 3.320, 0.497$

As we can see, all values in the resulting log-normal distribution are positive.

The values drawn from the normal distribution can be positive or negative, but exponentiation maps every one of them to a positive number.

**Example for multiplicative process:**

> Imagine you have a population of bacteria in a controlled environment, and the daily growth of this population is influenced by various independent factors. Each day, the number of new bacteria added is not a fixed amount but is instead a percentage increase based on these factors.

**Daily Growth in Logarithmic Scale:**
- Let's say you measure the daily growth of the bacteria population on a logarithmic scale. The daily growth, when expressed on this logarithmic scale, follows a normal distribution.

**Total Population:**
- Now, if you're interested in predicting the total population over time, you would consider the cumulative effect of daily growth.
- The total population on a given day $(P_t)$ can be expressed as the product of the previous day's population $(P_{t-1})$ and the exponential of that day's growth on the logarithmic scale $(e^{X_t})$.

$P_t = P_{t-1} \cdot e^{X_t}$

The total population $(P_t)$ on any given day then follows a log-normal distribution, because $\log(P_t)$ is the starting value plus a sum of normally distributed daily growths.

In this scenario, the log-normal distribution is appropriate because the growth of the population is influenced by multiple independent factors that operate on a multiplicative scale.

Each day's growth is not just an addition but a percentage increase based on the current population size and the cumulative effects of various factors.
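The multiplicative process above can be simulated in a few lines. This is a minimal sketch under assumed values: the starting population `p0`, the number of days, and the mean and standard deviation of the daily log-growth are all hypothetical numbers picked for illustration.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=0)

n_colonies, n_days = 50_000, 30   # hypothetical simulation size
p0 = 100.0                        # hypothetical starting population

# Daily log-growth X_t ~ Normal(0.05, 0.1); both numbers are assumed for illustration
daily_log_growth = rng.normal(loc=0.05, scale=0.1, size=(n_colonies, n_days))

# P_t = P_{t-1} * exp(X_t)  implies  P_T = P_0 * exp(X_1 + ... + X_T)
final_population = p0 * np.exp(daily_log_growth.sum(axis=1))

skew_raw = skew(final_population)          # right-skewed, as a log-normal should be
skew_log = skew(np.log(final_population))  # close to 0, as a normal should be

print("smallest population:", final_population.min())
print("skew of P_T        :", round(skew_raw, 2))
print("skew of log(P_T)   :", round(skew_log, 2))
```

Every simulated population is positive, the final populations are right-skewed, and their logs are roughly symmetric, which matches the positivity and multiplicative-process characteristics listed earlier.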
(till here)

In summary, a log-normal distribution is a good fit for positively skewed data: it ensures **positivity and aligns with multiplicative processes** often seen in real-world scenarios.

Now, let's see what the Poisson distribution is.

---
title: Poisson Distribution
description:
duration: 5400
card_type: cue_card
---

### <font color='blue'>**Poisson Distribution**</font> (10-12 mins)

> <font color='purple'>**Scenario: Traffic at a Toll Booth**</font>

```
Imagine you're at a toll booth on a highway,
observing the number of vehicles passing through the toll booth in a given time period.
```

The Poisson distribution comes into play when we want to <font color='purple'>**model the number of events that occur in a fixed interval of time or space**</font>.
- In this case, <font color='purple'>vehicles passing through the toll booth are our event</font>.

<img src='https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/524/original/toll.gif?1700571863' height=200>

<br>

<font color='purple'>**Explanation:**</font>

The Poisson distribution is a **discrete probability distribution**, particularly useful when dealing with events that occur randomly and independently, but with a known average rate.

In our toll booth scenario, we can make a few key observations:

1. <font color='purple'>**Fixed Interval:**</font>
    - Let's say we want to study the number of vehicles passing through the toll booth in a specific time period, <font color='purple'>say 1 hour</font>.

2. <font color='purple'>**Average Rate:**</font>
    - We have an average rate of vehicles passing through the toll booth, <font color='purple'>let's say 30 vehicles per hour</font>.
    - It is denoted as $λ$ (lambda), which represents the average rate of occurrence of the event within a given interval.
    - Here, <font color='purple'>λ is 30 vehicles per hour</font>.

<br>

Now, the Poisson distribution **helps us answer questions** like:

> <font color='purple'>Q. What is the probability of exactly 25 vehicles passing through the toll booth in the next hour?</font>

> <font color='purple'>Q. What is the probability of more than 40 vehicles passing through the toll booth in the next hour?</font>

This toll booth scenario is just one example of how the Poisson distribution is applied in various fields.

The graph below shows examples of Poisson distributions with different values of λ.

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/470/original/pd13.png?1700495067 height = 400 width = 600 >

- When **λ is low, the distribution is much longer on the right side of its peak than on its left**.
- As **λ increases, the spread of the distribution also increases**.
- If you <font color='purple'>keep increasing λ, the distribution looks more and more similar to a normal distribution</font>.

### <font color='blue'>**Poisson Distribution Formula**</font>

If a random variable $X$ follows a Poisson distribution, then the probability of $X = k$ successes can be found by the following formula:

- **$P[X=k] = \Large \frac{λ^k \ * \ e^{-λ}}{k!}$**

where:
- **λ**: rate or mean number of successes that occur during a specific interval
- **k**: number of successes
- **e**: a constant equal to approximately 2.71828

This is also known as the **Probability Mass Function (PMF)** of the Poisson distribution, because it gives the probability of an exact number of events.

<br>

### Example 1:

```
Suppose a particular hospital experiences an average of 2 births per hour.
We can use the formula above to determine the probability of experiencing 0, 1, 2, 3 births, etc. in a given hour:
```

Here, rate ($λ$) = 2

e = constant = 2.71828

$P[X=0] = \Large \frac{2^0 \ * \ e^{-2}}{0!} = 0.1353$

$P[X=1] = \Large \frac{2^1 \ * \ e^{-2}}{1!} = 0.2707$

$P[X=2] = \Large \frac{2^2 \ * \ e^{-2}}{2!} = 0.2707$

$P[X=3] = \Large \frac{2^3 \ * \ e^{-2}}{3!} = 0.1804$

<br>

Here, we can also find the probability using a function in Python, i.e. **poisson.pmf()**, as we are asked for the probability of an exact value.
- We have to pass 2 parameters to this function, **k:** number of events and **mu:** average or rate

Code:
```python=
# P[x=0]
poisson.pmf(k=0, mu=2)
```
>Output:
```
0.1353352832366127
```
Code:
```python=
# P[x=1]
poisson.pmf(k=1, mu=2)
```
>Output:
```
0.2706705664732254
```
Code:
```python=
# P[x=2]
poisson.pmf(k=2, mu=2)
```
>Output:
```
0.2706705664732254
```
Code:
```python=
# P[x=3]
poisson.pmf(k=3, mu=2)
```
>Output:
```
0.18044704431548356
```

Let's look at more examples.

### <font color='purple'>**Applications:**</font>

**1) Football Match Goals**

Imagine we have collected data for all the football matches ever played, and now we want to analyze the distribution of goals.
- We observe that the average number of goals per 90-minute match is 2.5
- So the rate will be 2.5 goals per match (λ = 2.5).

> <font color='purple'>Q. What is the probability of seeing exactly 1 goal in the last 30 mins?</font>

This is where the Poisson distribution comes into play.

Here, the rate is 2.5 goals per 90 mins (per match), which is the average number of goals in 90 mins.

- What will be the average number of goals in 45 mins?

2.5 goals -> 90 mins

The average number of goals for half of the time will be half of the total average rate.

Rate: 2.5/2 = 1.25 goals per 45 mins

Similarly, we can find the rate for 30 mins,

1.25 goals -> 45 mins

x goals -> 30 mins

x = (30 * 1.25)/45

Rate = 0.833 goals per 30 mins

So, $λ = 0.833$ for the last 30 mins, and the required probability is $P[X=1]$ with this rate.

> **Q: How long should you stay to witness a goal on average?**

- On average, one goal is scored every $90/2.5 = 36$ minutes, so <font color='purple'>**you should stay about 36 minutes to expect one goal on average**</font>.
- Staying 45 minutes corresponds to an expected 1.25 goals, in line with the rate of 1.25 goals per 45 minutes computed above.
- The longer you stay, the higher the probability of witnessing at least one goal, but the Poisson model never guarantees one.
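The football question above can be finished numerically. The following is a minimal sketch that assumes goals are scored at a constant rate throughout the match; `rate_for` is a hypothetical helper introduced here for illustration, not a library function.

```python
from scipy.stats import poisson

goals_per_match = 2.5   # average goals per 90-minute match (from the example above)
match_minutes = 90

def rate_for(minutes):
    """Scale the per-match rate to an interval of the given length.

    Assumes goals are scored at a constant rate throughout the match.
    """
    return goals_per_match * minutes / match_minutes

lam_30 = rate_for(30)                                      # rate for 30 mins
p_exactly_one = poisson.pmf(k=1, mu=lam_30)                # P[X = 1] in 30 mins
p_at_least_one_45 = 1 - poisson.pmf(k=0, mu=rate_for(45))  # P[X >= 1] in 45 mins

print(round(lam_30, 3), round(p_exactly_one, 4), round(p_at_least_one_45, 4))
```

Under these assumptions, the probability of exactly one goal in 30 minutes is roughly 0.36, and the probability of seeing at least one goal during a 45-minute stay is roughly 0.71.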
Next example,

<br>

**2) Support Phone Calls**

Think about a support centre that receives 100 calls per hour on average.
- So the average number of calls received per minute will be,

100 calls -> 60 mins (1 hr)

x calls -> 1 min

Rate: 100/60 = 1.667 calls/min

This allows us to analyze the probability of receiving a certain number of calls within a specific time frame.
- The call centre management can use this rate to determine the optimal number of customer service representatives to have on duty during different time periods.
- For instance, during peak times, when the call rate is high, more staff may be required to handle the increased volume.

One more example

<br>

**3) Hospital OPD Patients**

Consider a hospital's Outpatient Department (OPD) where, on average, 200 patients visit in a day (λ = 200).
- The average hourly rate of patient arrivals can be calculated by dividing the daily rate by the number of working hours.
- For example, if the facility operates for 8 hours, the hourly rate would be $\frac {200}{8} = 25$ patients per hour.
- The facility can use this information for resource planning, such as determining the optimal number of staff, doctors, and examination rooms needed to handle the expected patient load efficiently.

<br>

These are some real-life examples where the Poisson distribution can help us understand the likelihood of a given number of events occurring in a specific interval of time or space.

## <font color='blue'>**Rules of Poisson Distribution**</font> (5 mins)

**Key rules that govern the Poisson distribution:**

1. <font color='purple'>**Counting:**</font>
    - The Poisson distribution is tailored for **counting the number of discrete events happening within a fixed interval**.
    - The count can take on values like 0, 1, 2, 3, and so on.

2. <font color='purple'>**Independence:**</font>
    - The occurrence of one event should not affect the occurrence of another event.
    - Events are considered to be independent if the probability of one event happening doesn't change based on whether another event has occurred.

> **For example**,
- <font color='purple'>**if an accident occurs in Delhi at 4 PM, it will have no impact on the occurrence of an accident in Mumbai at the same time**</font>.
- Each event is independent of the other, and the outcome in one location does not influence the outcome in the other location.

<br>

3. <font color='purple'>**Rate (λ or μ):**</font>
    - The distribution is defined by a single parameter, often denoted as λ (lambda) or μ (mu), which represents the average rate of occurrence of the event within the given interval.
    - This rate remains constant throughout the interval and doesn't change based on the occurrences.

4. <font color='purple'>**No Simultaneous Events:**</font>
    - The Poisson distribution assumes that there cannot be more than one occurrence of the event at exactly the same time or within an infinitesimally small interval of time or space.
    - For instance, <font color='purple'>if a family of five people enters a store, it's counted as a single event, not five separate events.</font>

Let's look at some examples using the Poisson distribution.

---
title: Examples of poisson
description:
duration: 5400
card_type: cue_card
---

## <font color='purple'>Example 2:</font> (5 mins)

```
A city sees 3 accidents per day on average.
Find the probability that there will be 5 accidents tomorrow.
```

Solution:

Given,

The rate is 3 accidents per day on average,
- $λ = 3$

Let “$X$” denote the number of accidents tomorrow.
- We say “$X$” is Poisson distributed with rate ($λ$) = 3

<br>

So, the probability that there will be 5 accidents tomorrow is $P[X=5]$

By using the formula,
- $P[X=5] = \Large \frac{λ^5 \ * \ e^{-λ}}{5!} = \frac{3^5 \ * \ e^{-3}}{5!}$
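Evaluating this expression by hand, with intermediate values rounded, gives roughly:

$$
P[X=5] = \frac{3^5 \, e^{-3}}{5!} \approx \frac{243 \times 0.049787}{120} \approx \frac{12.098}{120} \approx 0.1008
$$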
Using Python,

Code:
```python=
# P[X=5]
poisson.pmf(k=5, mu=3)
```
>Output:
```
0.10081881344492458
```

There is about a 10% chance that there will be 5 accidents tomorrow.

<br>

**Next question**

> **Q. Find the probability that there will be 5 or fewer accidents tomorrow**.

Here we want to calculate $P[X≤5]$,

We will use **poisson.cdf()** here, as we want to calculate a cumulative probability.

- $P[X≤5] = P[X=0] + P[X=1] + P[X=2]+ P[X=3]+ P[X=4] + P[X=5]$

We can directly find it using poisson.cdf()

Code:
```python=
# P[X ≤ 5]
poisson.cdf(k=5, mu=3)
```
>Output:
```
0.9160820579686966
```

## <font color='purple'>Example 3:</font> (3 mins)

```
Let “X” be the number of typos in a page in a printed book, with a mean of 3 typos per page.
What is the probability that a randomly selected page has at most 1 typo?
```

Here, rate ($λ$) = 3

We want the probability of at most 1 typo, so we need to find $P[X≤1]$, which is $P[X=0] + P[X=1]$.

We can directly use **poisson.cdf()** here

Code:
```python=
# P[x≤1]
poisson.cdf(k=1, mu=3)
```
>Output:
```
0.1991482734714558
```

We get the same value by adding the two PMF terms:

Code:
```python=
prob = poisson.pmf(k=0, mu=3) + poisson.pmf(k=1, mu=3)
prob
```
>Output:
```
0.1991482734714558
```

There is about a 20% chance (19.9%) that a randomly selected page has at most 1 typo.

## <font color='purple'>Example 4:</font> (3 mins)

```
A shop is open for 8 hours. The average number of customers over those 8 hours is 74 - assume Poisson distributed.
(a) What is the probability that in 2 hours, there will be at most 15 customers?
(b) What is the probability that in 2 hours, there will be at least 7 customers?
```

For the first question, we need to find $P[X≤15]$ in 2 hrs
- The rate for this scenario will be:

74 customers -> 8 hrs

x customers -> 2 hrs

Rate = 2 * 74/8 = 74/4

Rate = 18.5 (for 2 hrs)

So, $λ$ = 18.5 and $k$ = 15, and the **poisson.cdf** call will be

Code:
```python=
poisson.cdf(k=15, mu=18.5)
```
>Output:
```
0.24902769151284776
```

> <font color='purple'>What is the probability that in 2 hours, there will be at least 7 customers?</font>

Here we want to find $P[X≥7]$, which is $1 - P[X≤6]$

We find the cumulative probability up to 6 customers and subtract it from 1. This gives the probability that there will be at least 7 customers.

Code:
```python=
# P[X≥7]
1 - poisson.cdf(k=6, mu=18.5)
```
>Output:
```
0.9992622541111789
```

---
title: Quiz 1
description:
duration: 60
card_type: quiz_card
---

# Question
It is known that a certain website makes 10 sales per hour on average. In a given hour, what is the probability that the site makes exactly 8 sales?

# Choices
- [x] 0.1125
- [ ] 0.3544
- [ ] 0.25
- [ ] 0.674

---
title: Quiz 1 explanation
description:
duration: 5400
card_type: cue_card
---

### Quiz 1 explanation

We want to find $P[X=8]$

Given, λ = 10 and k = 8

Code:
```python=
prob = poisson.pmf(k=8, mu=10)
prob
```
>Output:
```
0.11259903214902009
```

---
title: Quiz 2
description:
duration: 60
card_type: quiz_card
---

# Question
It is known that a certain hospital experiences 4 births per hour on average. In a given hour, what is the probability that 4 or fewer births occur?

# Choices
- [ ] 0.585
- [x] 0.6288
- [ ] 0.4723
- [ ] 0.82

---
title: Quiz 2 explanation
description:
duration: 5400
card_type: cue_card
---

### Quiz 2 explanation

Here we want to find $P[X≤4]$

Given, λ = 4 and k = 4,

Code:
```python=
prob = poisson.cdf(k=4, mu=4)
prob
```
>Output:
```
0.6288369351798734
```

---
title: Quiz 3
description:
duration: 60
card_type: quiz_card
---

# Question
An e-commerce website experiences an average of 10 credit card transactions per day. What is the probability that there will be at least 12 credit card transactions in a given day?

# Choices
- [ ] 0.2381
- [ ] 0.1263
- [x] 0.3032
- [ ] 0.1755

---
title: Quiz 3 explanation
description:
duration: 5400
card_type: cue_card
---

### Quiz 3 explanation

Here we want to find $P[X\ge12]$ where $X$ is the number of credit card transactions in a day.

Given, $λ$ = 10, $k$ = 12

Code:
```python=
prob = 1 - poisson.cdf(k=11, mu=10)
prob
```
>Output:
```
0.30322385369689386
```

poisson.cdf(k=11, mu=10) gives the **probability of having 11 or fewer transactions**.

To get the probability of at least 12 transactions, **we subtract this probability from 1**.

---
title: Poisson approximation to Binomial
description:
duration: 5400
card_type: cue_card
---

## <font color='blue'>**Poisson approximation to Binomial**</font> (5-7 mins)

```
There are 80 students in a kindergarten class.
Each one of them has a 0.015 probability of forgetting their lunch on any given day.
(a) What is the average or expected number of students who forget their lunch in the class?
(b) What is the probability that exactly 3 of them will forget their lunch today?
```

Solution:

First question,

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/471/original/pd14.png?1700496007 height = 150 width = 500 >

Code:
```python=
rate = 80*0.015 # average or rate
rate
```
>Output:
```
1.2
```

**Conclusion**: This implies that, on average, 1.2 students forget their lunch on a given day.

<br>

> <font color='purple'>(b) What is the probability that exactly 3 of them will forget their lunch today?</font>

Here, k = 3 and λ = 1.2

We can directly use **poisson.pmf()**

Code:
```python=
poisson.pmf(k=3, mu=1.2)
```
>Output:
```
0.08674393303071422
```

There is an 8.67% chance that exactly 3 of them will forget their lunch today.

<br>

> <font color='purple'>**Q. Can I model this question as a binomial distribution?**</font>

We have 80 students, and we can define two probabilities here
- probability of success $P(s)$ = student forgot the lunch = $0.015$
- probability of failure $P(f)$ = $1 - P(s) = 1 - 0.015 = 0.985$

We want $P[X=3]$,
- we can represent it as **out of 80 trials, I want 3 successes**

Using the binomial formula, it will be
- $^{80}C_3 (0.015)^3 (1-0.015)^{77}$

So we have recast the question as a binomial problem.

Code:
```python=
binom.pmf(k=3, n=80, p=0.015) # Large n, small p, np=mu
```
>Output:
```
0.08660120920447566
```

We get similar answers using both **Poisson and binomial**.

<br>

<font color='purple'>In binomial</font>
- We are counting the number of successes in $n$ trials where $P(s)$ = $p$

<font color='purple'>In Poisson</font>
- We are counting the number of occurrences in a given time interval.

<br>

- Each trial succeeds with probability $p$, so the expected number of successes in $n$ trials is $np$:

1 trial -> p

n trials -> np

Here, $P(s)$ for 1 student is $0.015$. So the expected number of students, out of 80, who forget their lunch is $80 * 0.015 = 1.2$

So we can say that <font color='purple'>**$\Large λ = np$**</font>

This approximation is known as the <font color='purple'>**Poisson approximation to the binomial distribution**</font>.

### <font color='purple'>**Conditions for a reasonable approximation:**</font>

- When the number of trials $(n)$ is large and the probability of success $(p)$ is small, the binomial distribution can be approximated by a Poisson distribution.
- For a reasonable approximation:
    - $np ≤ 10$
    - $p ≤ 0.1$

<font color='red'>***Instructor Note:***</font>
The concept of "large enough" for the number of trials $(n)$ in the context of the Poisson approximation to the binomial distribution doesn't have a fixed, universally agreed-upon threshold. However, a commonly used guideline is that $n$ should be such that $np≤10$

- If the above conditions are met, we can use the Poisson distribution to estimate the probabilities of different event counts.
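The quality of the approximation can be checked numerically. The sketch below uses a hypothetical helper, `max_pmf_gap`, introduced here only for illustration. It compares the binomial and Poisson PMFs for the lunch example ($n=80$, $p=0.015$) and, for contrast, for an assumed case with a large $p$ ($p=0.4$) where the conditions above do not hold.

```python
import numpy as np
from scipy.stats import binom, poisson

def max_pmf_gap(n, p):
    """Largest absolute difference between Binomial(n, p) and Poisson(n*p) PMFs over k = 0..n."""
    k = np.arange(0, n + 1)
    return np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, n * p)))

# Lunch example from the notes: large n, small p
gap_small_p = max_pmf_gap(80, 0.015)

# Assumed contrast case (not from the notes): p is far too large for the approximation
gap_large_p = max_pmf_gap(80, 0.4)

print(f"max PMF gap, n=80, p=0.015: {gap_small_p:.4f}")
print(f"max PMF gap, n=80, p=0.4  : {gap_large_p:.4f}")
```

The gap for the lunch example is small, while the gap for the large-$p$ case is noticeably bigger, which is why the conditions on $np$ and $p$ matter.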
<br>

So, **in the context of our problem**, <font color='purple'>with $n=80$ and $p=0.015$, the conditions $np\le10$ and $p≤0.1$ are satisfied.</font>
- We can use the Poisson distribution with $λ=80×0.015=1.2$ as an approximation to the binomial distribution.

<img src = https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/057/472/original/pd15.png?1700496171 height = 100 width = 600 >

---
title: Conclusion
description:
duration: 5400
card_type: cue_card
---

## <font color='blue'>Conclusion</font>

With that, we wrap up this lecture. Please go through the lecture once more to clearly understand all the topics that we have covered today.
