# The hiring problem You are using an employment agency to hire a new office assistant. The agency sends you one candidate each day. You interview the candidate and must immediately decide whether or not to hire that person. But if you hire, you must also fire your current office assistant if it is someone you have recently hired. * **Cost to interview** per candidate: $c_i$ * **Cost to hire** per candidate: $c_h$ We can assume $c_h > c_i$. You are committed to having hired, at all times, the best candidate seen so far. This means that whenever you interview a candidate who is better than your current office assistant, you must fire the current office assistant and hire the candidate. Since you must have someone hired at all times, you will always hire the first candidate that you interview. **Goal:** Determine what the price of this strategy will be. **Algorithm in pseudo-code:** ``` HIRE-ASSISTANT(n) best ← 0 // candidate 0 is a least-qualified dummy candidate for i ← 1 to n do interview candidate i if candidate i is better than candidate best then best ←i hire candidate i ``` **Cost:** If $n$ candidates, and we hire $m$ of them, the cost is $\mathcal O(nc_i +mc_h)$. * Have to pay $nc_i$ to interview, no matter how many we hire. * So we focus on analyzing the hiring cost $mc_h$. * $m c_h$ varies with each run---it depends on the order in which we interview the candidates. * This is a model of a common paradigm: we need to find the maximum or minimum in a sequence by examining each element and maintaining a current "winner". The variable $m$ denotes how many times we change our notion of which element is currently winning. **Worst-case analysis:** In the worst case, we hire all $n$ candidates. This happens, if each next candidate is better than the previous one. Then the cost is $\mathcal O(nc_i + nc_h) = \mathcal O(nc_h)$, since $c_h > c_i$. **Probabilistic analysis:** We do not in general know in which order the applicants appear, hence we must use knowledge or make assumption over the probability distribution of inputs. Assume that the candidates arrive in a random order. Let $X$ be a random variable that equals the number of times we hire a new office assistant. Define so-called **indicator random variables** $X_1, X_2,\ldots , X_n$, where $X_i = \mathbb{I}{\{\text{candidate } i \text{ is hired}\}}$. We need to compute $\mathbb P(\text{candidate } i \text{is hired})$. Candidate $i$ is hired if and only if candidate $i$ is better than each of candidates $1,2,\ldots ,i−1$. We assume that the candidates arrive in random order. That means any one of the first $i$ candidates is equally likely to be the best one so far. Thus, $\mathbb P (\text{candidate } i \text{ is the best so far})=1/i$, so $\mathbb E[X_i]=1/i$. We can compute: \begin{align} \mathbb E[X] & = & \mathbb E \left[ \sum_{i=1}^n X_i\right] \\ & = & \sum_{i=1}^n \mathbb E[X_i] \\ & = & \sum_{i=1}^n \frac{1}{i} \\ & \leq & \sum_{i=0}^{\lceil \log_2 n \rceil} \sum_{j = 0}^{2^i - 1} \frac{1}{2^i + j} \\ & \leq & \sum_{i=0}^{\lceil \log_2 n \rceil} \sum_{j = 0}^{2^i - 1} \frac{1}{2^i} \\ & = & \sum_{i=0}^{\lceil \log_2 n \rceil} 1 \\ & \leq & \log_2 n + 1 \end{align} Thus, the expected hiring cost is $\mathcal O(c_h \cdot \log_2 n)$, which is much better than the worst-case cost of $\mathcal O(n \cdot c_h)$.