# Why Isn't the Bus Coming?

## Introduction

When I was little, I read a picture book called **バスにのって** by the Japanese illustrator 荒井良二 ([Ryoji Arai](https://www.ryoji-arai.com/archives)). The story is about a guy who waits for a bus all day, murmuring "Why isn't the bus coming?" This book is part of my childhood memories, and throughout my school years, from elementary school to college, I often took the bus to school. Sometimes, while waiting at the bus stop, I wondered: the schedule says buses come every 10 minutes on average, so why does it feel like I usually wait longer than the expected $\frac{10}{2} = 5$ minutes? Is it just my imagination, or is there some science behind it?

A few days ago, I stumbled upon a [YouTube video](https://www.youtube.com/watch?v=wS54Gsq_4sE) explaining the **Waiting-Time Paradox**, which is a type of inspection paradox. One of the claims in the video is: if buses arrive randomly every 10 minutes on average, and we (commuters) also arrive randomly, the **average waiting time** is actually 10 minutes, not the expected 5 minutes.

This intrigued me, so I decided to simulate the bus waiting time myself to check whether the claim is correct.

---

## Simulation

I simulated the bus waiting scenario with the following assumptions:

- The scenario is simulated 10,000 times.
- Six buses arrive at random (uniformly distributed) times within a 60-minute period.
- A commuter also arrives at a random time within the same period.
- If no bus arrives after the commuter, the schedule wraps around and the commuter takes the first bus of the next cycle.
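For contrast, here is a minimal baseline sketch of the case where the intuitive answer does hold. It is an illustrative addition, not part of the simulation itself. It assumes the six buses are perfectly evenly spaced, one every 10 minutes, while the commuter still arrives at a random time. The names `rng`, `interval`, `waits`, and `baseline_mean` are hypothetical, and the sketch uses only the standard library.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

num_simulations = 10_000
total_minutes = 60
num_buses = 6
interval = total_minutes / num_buses  # 10 minutes between consecutive buses

waits = []
for _ in range(num_simulations):
    # The commuter arrives uniformly at random within the hour
    arrival = rng.uniform(0, total_minutes)
    # Buses leave at 10, 20, ..., 60, so the next one is at the next multiple of 10
    waits.append(interval - (arrival % interval))

baseline_mean = sum(waits) / num_simulations
print(f"Average wait with evenly spaced buses: {baseline_mean:.2f} minutes")
```

With evenly spaced buses the average wait should land close to half the interval. Any extra waiting in the random-arrival scenario therefore comes from the irregular gaps between buses, not from the commuter's own timing.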
Here’s the Python code I used for the simulation:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

# Simulation parameters
num_simulations = 10000
total_minutes = 60
num_buses = 6

# Simulate bus arrival times (uniformly distributed over total_minutes)
bus_arrivals = np.random.uniform(0, total_minutes, (num_simulations, num_buses))
bus_arrivals.sort(axis=1)  # Sort bus arrival times within each simulation

# Simulate the random arrival time of the person at the bus stop
person_arrivals = np.random.uniform(0, total_minutes, num_simulations)

# Calculate the waiting time for each simulation
waiting_times = np.zeros(num_simulations)
for i in range(num_simulations):
    # Keep only the buses that arrive at or after the person
    waiting_times_for_buses = bus_arrivals[i][bus_arrivals[i] >= person_arrivals[i]] - person_arrivals[i]
    if waiting_times_for_buses.size > 0:
        # The person boards the first of those buses
        waiting_times[i] = np.min(waiting_times_for_buses)
    else:
        # No bus comes after the person: wrap around to the first bus of the next cycle
        waiting_times[i] = total_minutes - person_arrivals[i] + np.min(bus_arrivals[i])

# Average waiting time
average_waiting_time = np.mean(waiting_times)

# Weight each sample so that every bar shows the percentage of simulations in its bin.
# (Multiplying a density by 100 would give "% per minute", not a percentage.)
weights = np.full(num_simulations, 100.0 / num_simulations)

# Plot the histogram of waiting times as percentages
plt.figure(figsize=(10, 6))
plt.hist(waiting_times, bins=30, weights=weights, alpha=0.7, color='blue', edgecolor='black')
plt.xlabel('Waiting Time (minutes)')
plt.ylabel('Frequency')
plt.title(f'Histogram of Waiting Times (n = {num_buses}) --- Average waiting time: {average_waiting_time:.2f} minutes')
plt.grid(True)

# Format y-axis labels to include '%'
plt.gca().yaxis.set_major_formatter(PercentFormatter(decimals=0))
plt.show()
```

Surprisingly, the average waiting time in the simulation is **neither 5 minutes nor 10 minutes**, but about **8.6 minutes**. Why? Is there something wrong with my simulation?

![6](https://hackmd.io/_uploads/B1o5GZdH1l.png)

Then I came across an article, ["A simple proof of the bus paradox"](https://hapax.github.io/mathematics/statistics/everyday/paradox-bus/), which provided the explanation. The author derives a formula for the average waiting time:

$$ T = \frac{n}{n+1} \cdot t $$

Here:

- $T$ is the average waiting time.
- $n$ is the number of buses.
- $t$ is the average interval between buses.

In my simulation:

- $t = 10$ (buses come every 10 minutes on average).
- $n = 6$.

Plugging in the values:

$$ T = \frac{6}{7} \cdot 10 \approx 8.6 $$

This matches the simulation result! Furthermore, if we increase $n$ and the total simulation period, for example to 1,000 buses arriving in 10,000 minutes (still with $t = 10$), then:

$$ T = \frac{1000}{1001} \cdot 10 \approx 10 $$

This agrees with the claim in the video: the average waiting time approaches $t$ as $n$ increases.

![1000](https://hackmd.io/_uploads/SJlsGW_BJe.png)

---

## Discussion

This problem is more difficult than it seems. A formal proof involves concepts such as the **Poisson process**, **probability theory**, and the **inspection paradox**, which are beyond the scope of this article. However, here are some observations from the simulation:

1. The histogram of waiting times is **right-skewed**: short waits are the most frequent, but the long tail of occasional long waits pulls the average up.
2. The average waiting time follows $\frac{n}{n+1} \cdot t$. When $n$ (the number of buses) is small, this is noticeably less than $t$; as $n$ grows, it approaches $t$, as the two simulations show.

Ultimately, the video’s claim is correct: if buses arrive randomly every 10 minutes on average, a commuter who also arrives at random waits 10 minutes on average, longer than intuition suggests.

---

## References

1. [Ryoji Arai](https://www.ryoji-arai.com/archives)
2. [【畢導】看了這個視頻，你會釋懷你倒霉的一生](https://www.youtube.com/watch?v=wS54Gsq_4sE)
3. [A simple proof of the bus paradox](https://hapax.github.io/mathematics/statistics/everyday/paradox-bus/)