# Monte Carlo Methods
**Monte Carlo methods** are a broad class of computational techniques that rely on *repeated random sampling* to obtain numerical results. These methods are invaluable when an exact analytic solution is either impossible or too complex to compute. The core principle is the *Law of Large Numbers*, which ensures that the average of a large number of trials will converge to the expected value.
```python
import numpy as np
import matplotlib.pyplot as plt
import random
from scipy.stats import norm
```
## Key Concepts and Properties
### Definition and Applications
Monte Carlo methods use repeated sampling to simulate an event multiple times and predict a range of outcomes for future events.
| Feature | Description |
| :--- | :--- |
| Core Method | Repeated random sampling. |
| Primary Uses | Simulating an event multiple times, predicting a range of outcomes, and sensitivity analysis of a solution. |
### Convergence and Efficiency
Monte Carlo methods generally **converge slowly**: the statistical error decreases at a rate on the order of $\mathcal{O}\left(\frac{1}{\sqrt{n}}\right)$, where $n$ is the number of simulations, so reducing the error by a factor of 10 requires roughly 100 times more samples. A *large number of simulations* is therefore required to achieve accurate results. Consequently, Monte Carlo is not always the most efficient way to solve a problem, though its utility often depends on the specific context.
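As an illustrative sketch (the specific experiment is not from these notes), the $\frac{1}{\sqrt{n}}$ behavior can be seen by estimating the mean of a $\text{Uniform}(0, 1)$ variable at increasing sample sizes:

```python
import numpy as np

# Estimate E[U] = 0.5 for U ~ Uniform(0, 1) at increasing sample sizes;
# the absolute error shrinks roughly like 1/sqrt(n).
rng = np.random.default_rng(0)

for n in [100, 10_000, 1_000_000]:
    error = abs(rng.uniform(0, 1, size=n).mean() - 0.5)
    print(f"n = {n:>9,}  |error| = {error:.5f}  1/sqrt(n) = {1 / np.sqrt(n):.5f}")
```

Each extra decimal digit of accuracy costs about a hundredfold more samples, which is why efficiency matters for Monte Carlo work.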
### Law of Large Numbers
The *Law of Large Numbers* is the theoretical underpinning of Monte Carlo. It states that the *sample average* of a sequence of random variables will *converge to the expected value (expectation)* of the distribution as the number of samples increases to infinity.
### Recovering Unknown Distributions
Monte Carlo is often used when the probability distribution of a system is unknown, particularly when multiple random variables are combined through a complex function $f(X,Y,Z, \dots)$. By generating a large sample space, Monte Carlo allows us to:
1. **Visualize** the resulting probability distribution (e.g., using a histogram).
2. **Estimate** the distribution's type, or use advanced techniques like *Kernel Density Estimation (KDE)* to fit a smooth function to the resulting data.
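A minimal sketch of both steps, using a hypothetical combination $Z = X + XY$ (an arbitrary choice whose exact distribution is not obvious) and SciPy's `gaussian_kde` for the smooth fit:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Sample a combination of random variables whose distribution is unknown
rng = np.random.default_rng(1)
X = rng.normal(0, 1, size=10_000)
Y = rng.uniform(0, 1, size=10_000)
Z = X + X * Y  # hypothetical complex function f(X, Y)

# Fit a smooth density to the samples with Kernel Density Estimation
kde = gaussian_kde(Z)
grid = np.linspace(Z.min(), Z.max(), 200)

plt.hist(Z, bins=50, density=True, alpha=0.5, label='histogram')
plt.plot(grid, kde(grid), label='KDE')
plt.legend()
plt.show()
```

The histogram visualizes the distribution directly, while the KDE curve gives a smooth estimate that can be evaluated at any point.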
## Example 1: Flipping a Coin
We can model a fair coin flip using a discrete random variable where $0$ represents heads and $1$ represents tails. The expected average outcome of this experiment is $0.5$.
### Monte Carlo Simulation and Law of Large Numbers
As the number of coin flips $n$ increases, the average of all flips should tend towards the expected value of $0.5$.
```python
# Function to simulate a single coin flip (0 = heads, 1 = tails)
def coin():
    return random.randint(0, 1)

# Monte Carlo simulation: record each flip and the running average
flips_mc = []
average_flip_mc = []

def monte_carlo_coin(n):
    for i in range(n):
        flip = coin()
        flips_mc.append(flip)
        flip_avg = np.mean(flips_mc)
        average_flip_mc.append(flip_avg)

monte_carlo_coin(100)

# Plot the running average of the coin flips
plt.plot(average_flip_mc)
plt.axhline(y=0.5, color='r', linestyle=':')
plt.xlabel('flip number')
plt.ylabel('average of all flips')
plt.show()
```

The plot demonstrates the Law of Large Numbers. For small $n$, the average is highly oscillatory, but as $n$ increases, the average value converges to $0.5$.
## Example 2: Dice Sum
We simulate rolling two standard, fair dice and track the sum. The possible sums range from 2 to 12, and the theoretical expected sum is 7.
### Simulation and Visualization
We first define a function to roll two dice and sum the results, and then run a Monte Carlo simulation for a large number of trials (e.g., $N=4000$).
```python
# Function to simulate rolling two dice and summing the results
def sum_dice():
    return random.randint(1, 6) + random.randint(1, 6)

# Monte Carlo simulation: record each roll and the running average
rolls_mc = []
rolls_average_mc = []

def monte_carlo(n):
    for i in range(n):
        roll = sum_dice()
        rolls_mc.append(roll)
        roll_avg = np.mean(rolls_mc)
        rolls_average_mc.append(roll_avg)

monte_carlo(4000)

# Plot the running average of the rolls
plt.plot(rolls_average_mc)
plt.axhline(y=7, color='r', linestyle=':')
plt.xlabel('roll number')
plt.ylabel('average of all rolls')
plt.show()
```

#### Average Convergence
The average of all rolls converges to the expected value of 7.
#### Probability Distribution
The probability distribution of the sums is obtained by plotting a normalized histogram of the Monte Carlo results. The discrete sums (2 through 12) form a clear *triangular distribution* centered at 7.
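The triangular shape can be checked against the exact probabilities: for two fair dice, $P(S = s) = \frac{6 - |s - 7|}{36}$. A quick comparison (the sample size $N$ here is an arbitrary choice):

```python
import random
import numpy as np

# Exact PMF of the sum of two fair dice: P(S = s) = (6 - |s - 7|) / 36
sums = np.arange(2, 13)
exact = (6 - np.abs(sums - 7)) / 36

# Monte Carlo estimate of the same probabilities
N = 100_000
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]
empirical = np.array([rolls.count(s) for s in sums]) / N

for s, p_exact, p_emp in zip(sums, exact, empirical):
    print(f"sum {s:2d}: exact {p_exact:.4f}, simulated {p_emp:.4f}")
```

The simulated frequencies converge to the triangular PMF as $N$ grows.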
#### Fitting a Normal Distribution
The distribution of the sum of dice rolls already resembles a *Normal distribution*, an outcome anticipated by the *Central Limit Theorem* (the sum of independent random variables tends toward Normal as more terms are added). The simulated distribution can be compared against a theoretical Normal distribution using the sample mean and variance (cyan curve) and the exact mean ($\mu = 7$) and variance ($\sigma^2 = 35/6$, red dashed curve).
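A sketch of that comparison. The exact moments of the two-dice sum are $\mu = 7$ and $\sigma^2 = 2 \cdot \frac{35}{12} = \frac{35}{6}$, since each die has variance $\frac{35}{12}$:

```python
import random
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Simulate the two-dice sums
N = 4_000
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]

sample_mu, sample_var = np.mean(rolls), np.var(rolls)
exact_mu, exact_var = 7.0, 35 / 6  # exact moments of the two-dice sum

# Overlay both Normal curves on the normalized histogram
x = np.linspace(2, 12, 200)
plt.hist(rolls, bins=np.arange(1.5, 13.5), density=True)
plt.plot(x, norm.pdf(x, sample_mu, np.sqrt(sample_var)), 'c', label='sample fit')
plt.plot(x, norm.pdf(x, exact_mu, np.sqrt(exact_var)), 'r--', label='exact moments')
plt.legend()
plt.show()
```

With $N = 4000$ rolls the sample moments land close to the exact values, so the two curves are nearly indistinguishable.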
## Example 3: Rolling a Single Die
When modeling a single die roll, the underlying distribution is Uniform because each outcome (1 through 6) is equally likely.
### Simulating and Plotting
A Monte Carlo simulation is run (e.g., $N=600$ times) to generate the roll density.
```python
# Function to simulate a single die roll (1 to 6)
def die_roll():
    return random.randint(1, 6)

# Generate 600 die rolls
number = 600
roll = []
for i in range(number):
    roll.append(die_roll())

# Print the average roll value (should approach 3.5)
print(np.mean(roll))

# Turn the rolls into a probability distribution
buckets = 6  # one bin for each face, 1 through 6
plt.hist(roll, buckets, density=True)
plt.ylabel('Probability Distribution')
plt.xlabel('Die Roll')
plt.show()
```

The resulting histogram should approximate a uniform distribution, with each face (1-6) appearing roughly $\frac{1}{6}$ of the time. The final plot compares this histogram to the theoretical, exact uniform distribution.
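A sketch of that comparison, overlaying the exact uniform probability ($\frac{1}{6}$ per face) on the simulated histogram (with one bin centred on each face, so the bar heights are probabilities):

```python
import random
import numpy as np
import matplotlib.pyplot as plt

# Simulate 600 single-die rolls
N = 600
rolls = [random.randint(1, 6) for _ in range(N)]

# One bin per face, centred on the integers 1..6, so density = probability
plt.hist(rolls, bins=np.arange(0.5, 7.5), density=True, label='simulated')
plt.axhline(y=1/6, color='r', linestyle=':', label='exact (1/6)')
plt.xlabel('Die Roll')
plt.ylabel('Probability Distribution')
plt.legend()
plt.show()
```

With only 600 rolls the bars still wobble visibly around the $\frac{1}{6}$ line, illustrating the slow convergence noted below.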
#### Advantages and Disadvantages in this Context
* **Advantage:** Monte Carlo *systematically simulates* the problem, making it easy to predict the probability distribution for more complex scenarios (e.g., two-dice sum) where analytic solutions are harder to derive.
* **Disadvantage:** For a simple, known distribution (like the single die roll, $P(X=x) = 1/6$), Monte Carlo convergence is *slow*. A simple analytic calculation is much faster and more accurate.
## Example 4: Transit Travel Time
### Model 1: Sum of Two Normal Variables
Two independent bus trip times $X_1 \sim N(\mu_1 = 11, \sigma^2_1 = 2)$ and $X_2 \sim N(\mu_2 = 20, \sigma^2_2 = 4)$ are summed to find the total travel time $Y$. *The sum of two independent Normal variables is itself Normal*, with $\mu_Y = \mu_{X_1} + \mu_{X_2} = 31$ and $\sigma^2_Y = \sigma^2_{X_1} + \sigma^2_{X_2} = 6$.
**Monte Carlo Objective:** Determine the probability $P(Y < 30 \text{ minutes})$.
```python
N = 10000
X1 = np.random.normal(loc=11, scale=np.sqrt(2), size=N)
X2 = np.random.normal(loc=20, scale=np.sqrt(4), size=N)
Y = X1 + X2

# Plot histogram of X1
plt.hist(X1, bins=40, density=True)
plt.title('Distribution of first bus ride')
plt.xlabel('Time')
plt.ylabel('Probability')
plt.show()

# Plot histogram of X2
plt.hist(X2, bins=40, density=True)
plt.title('Distribution of second bus ride')
plt.xlabel('Time')
plt.ylabel('Probability')
plt.show()

# Plot histogram of total trip Y = X1 + X2
plt.hist(Y, bins=40, density=True)
plt.axvline(x=30, color='r')  # cut-off for the question
plt.title('Distribution of total bus ride')
plt.xlabel('Time')
plt.ylabel('Probability')
plt.show()

# Probability that the total trip is under 30 minutes
prob = np.sum(Y < 30) / len(Y) * 100
print(prob, "%")
```



```
33.09 %
```
The simulated probability that the transit time is less than 30 minutes is about $33.09\%$.
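Because $Y$ is exactly Normal here, the simulated value can be cross-checked against the closed form $P(Y < 30) = \Phi\left(\frac{30 - 31}{\sqrt{6}}\right)$:

```python
import numpy as np
from scipy.stats import norm

# Closed-form check: Y ~ N(mu=31, sigma^2=6)
p_exact = norm.cdf(30, loc=31, scale=np.sqrt(6))
print(f"P(Y < 30) = {p_exact:.4f}")  # about 0.34, consistent with the simulation
```

The small gap between the exact value and the simulated $33.09\%$ is ordinary Monte Carlo sampling error at $N = 10000$.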
### Model 2: Adding an Exponential Wait Time
A refinement is added: a wait time $X_3 \sim \text{Exp}(\lambda)$ with rate $\lambda = 1/\mu_3$, i.e. mean wait $\mu_3 = 5$ minutes, giving $Y = X_1 + X_2 + X_3$. Since the sum now mixes an Exponential with Normals, the result is *no longer Normal* (it is a convolution of the two families), and Monte Carlo gives a direct way to approximate its distribution.
```python
N = 20000
X1 = np.random.normal(loc=11, scale=np.sqrt(2), size=N)
X2 = np.random.normal(loc=20, scale=np.sqrt(4), size=N)
X3 = np.random.exponential(scale=5, size=N)
Y = X1 + X2 + X3

# Plot histogram of X3
plt.hist(X3, bins=40, density=True)
plt.title('Distribution of wait between bus rides')
plt.xlabel('Time')
plt.ylabel('Probability')
plt.show()

# Plot histogram of total trip Y = X1 + X2 + X3
plt.hist(Y, bins=40, density=True)
plt.axvline(x=30, color='r')  # cut-off for the question
plt.title('Distribution of total bus ride')
plt.xlabel('Time')
plt.ylabel('Probability')
plt.show()

# Probability that the total trip is under 30 minutes
prob = np.sum(Y < 30) / len(Y) * 100
print(prob, "%")
```


```
8.815000000000001 %
```
The simulated probability that the transit time is less than 30 minutes drops to about $8.8\%$, reflecting the time lost waiting for the second bus.
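In this particular case the convolution does have a known closed form: a Normal plus an independent Exponential follows the *exponentially modified Gaussian* distribution, available in SciPy as `exponnorm`. In SciPy's parameterization the shape $K$ is the exponential mean divided by the Normal standard deviation (an assumption worth verifying against the SciPy docs), which lets us cross-check the simulation:

```python
import numpy as np
from scipy.stats import exponnorm

# Normal part: N(31, 6); exponential part: mean 5
sigma = np.sqrt(6)
K = 5 / sigma  # SciPy shape parameter: exponential mean / normal std
p = exponnorm.cdf(30, K, loc=31, scale=sigma)
print(f"P(Y < 30) = {p:.4f}")  # close to the simulated ~8.8%
```

That a closed form exists here is something of an exception; as the remark below notes, combinations of random variables generally have no tractable distribution, which is where Monte Carlo earns its keep.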
**Remark:** When combining different random variable types (e.g., Normal and Exponential), the resulting distribution is generally unknown and must be approximated through simulation, which is a core strength of the Monte Carlo method.