Leetcode 808. Soup Servings – Part 2

# 808. Soup Servings -- Part 2 This article is the successor of [Leetcode: 808. Soup Servings](https://hackmd.io/UbyuI8DeTNiL2dpFTOabXA#808-Soup-Servings). Please refer to it for the problem definition. In addition to the two approaches discussed in the previous article, this article aims to explore a more efficient method. The objective is to obtain a **closed-form solution for $p_A^{\text{root}}$ and $p_B^{\text{root}}$** directly, without iterating through all the nodes in the graph. # Pascal Triangle and Binomial Expansion ## Intuition Initially, let's examine a simplified version of this problem where each non-leaf node has 2 children. In this scenario, we can compute the Unique Path Count (UPC) using Pascal's triangle. As the values in Pascal's triangle possess a closed-form solution, we can directly determine the UPC for the leaves without the need to iterate through the entire graph. In this simplified scenario, only two valid operations are considered: 1. Serve 4 units of soup A and 0 unit of soup B, 2. Serve 1 unit of soup A and 3 unit of soup B. We designate $n_A$ and $n_B$ as the present remaining volumes of soup for types A and B, respectively. For a given $n_A$ and $n_B$, we denote $p_A$ and $p_B$ as the probabilities that **soup A will exhaust first** or that **A and B will deplete simultaneously**. In the problem statement, the initial state of soups $(n_a,n_b)$ is $(n,n)$. We can deduce the expansion of the state graph $(n_a,n_b)$, as depicted below. ![image](https://hackmd.io/_uploads/ryWdhn9Op.png) We can observe that the expansion graph forms a Pascal triangle. To calculate the *Unique Path Count (UPC)*, we introduce the concept of *depth* for a node, which starts from $(n,n)$ with a depth of zero. For each subsequent level below, the *depth* increases by $1$. Additionally, we define the *order* of a node, which begins from the leftmost node in a given level with an *order* of zero. For each node to the right of this node, the *order* increases by 1, as illustrated below. ![image](https://hackmd.io/_uploads/BJq-S-n_T.png) For a node situated at $depth=d$ and $order=r$, we can apply the binomial expansion of $(x^0+x^1)^d$. The *Unique Path Count (UPC)* for this node is then determined as the coefficient of the term $x^r$, which is $\frac{d!}{r!(d-r)!}x^r$. $$ (x^0+x^1)^d = \frac{d!}{0!(d-0)!}x^0+\frac{d!}{1!(d-1)!}x^1+\cdots+\frac{d!}{r!(d-r)!}x^r+\cdots+\frac{d!}{d!(d-d)!}x^d $$ ## Approach We represent $p_A^{\text{root}}$ and $p_B^{\text{root}}$ as the values of $p_A$ and $p_B$ at the initial state $(n,n)$, respectively. To derive the closed-form solution for these values, let's first examine the cases when $n=12k$, where $k$ is a positive integer. After expanding the state diagram, we obtain: ![image](https://hackmd.io/_uploads/SyURJz3Oa.png) In this graph, the leaf nodes with $n_a\leq 0, n_b > 0$ are highlighted in purple, those with $n_a \leq 0, n_b \leq 0$ are highlighted in orange, and those with $n_a > 0, n_b \leq 0$ are highlighted in yellow. For computational convenience, expanding the left-hand-side leaf nodes and right-hand-side leaf nodes is equivalent to the following Pascal triangle:   ![image](https://hackmd.io/_uploads/B1isKrT_p.png) In this Pascal triangle, we can directly use the closed-form solution to calculate the *Unique Path Count (UPC)* for each leaf node. As introduced in the last article, the probability $p_X^{\text{root}}, X=\{A,B\}$ is the sum of the probability $p_X$ at each leaf times the UPC of each leaf: $$ p_X^{\text{root}} = \sum_{i} c_i p_{X}^i, $$ where $c_i$ is the UPC at leaf $i$ and $p_X^{i}$ represents the $p_X$ at node $i$. Furthermore, it is straightforward that the depth of node $(-12k,12k)$ is $d=6k$, and its order is $r=0$. For node $(-3,3)$ the depth is $d=6k$ and the order is $r=4k-1$. Hence, we have: $$ p_A^{\text{root}} = \sum_{r=0}^{4k-1}\frac{(6k)!}{r!(6k-r)!} 0.5^{6k}, $$ where $0.5^{6k}$ is the value of $p_A$ at each leaf. To calculate $p_B^{\text{root}}$, there is only one node $(0,0)$ satisfying $n_a \leq 0, n_b \leq 0$. Additionally, there is only one edge connecting $(0,0)$, which is from $(1,3)$. The UPC of $(0,0)$ is equivalent to the UPC of $(1,3)$. The depth of $(1,3)$ is $d=6k-1$, and its order is $r=4k-1$. Thus: $$ p_B^{\text{root}} = \frac{(6k-1)!}{(4k-1)!(2k)!}0.5^{6k}. $$   The closed-from solution in the cases of $n \text{ mod } 12 \neq 0$ can be derived by the similar approach, which is omitted in this article. ## Code The implementation of this closed-form solution is shown below: ```python from math import factorial def soupServingsBinaryClosedForm(n): # here we only consider n = 12k assert n%12 == 0 k = int(n/12) # closed-form solution of pA_root, pC_root and pB_root pA_root = sum(factorial(6*k) / (factorial(r) * factorial(6*k - r)) * 0.5**(6*k) for r in range(4*k)) pB_root = factorial(6*k -1 ) / \ (factorial(4*k - 1) * factorial(2*k)) * 0.5**(6*k) return pA_root, pB_root ``` We also rewrite the method in our previous article [Leetcode: 808. Soup Servings](https://hackmd.io/UbyuI8DeTNiL2dpFTOabXA#808-Soup-Servings), which iteratively go through each level to count the UCPs. This method is rewritten for the simplified problem as follows: ```python def soupServingsBinaryIterative(n): # trivial cases if n == 0: return 0.5 # Initialize. nA, nB = n, n pA_root, pB_root = 0, 0 d = 0 # N is to store (nA, nB), and C is the array to stor c. N = [(nA,nB)] C = [1] # If there is any non-leaf node (nA,nB), continue for the next level. while nA+nB > 0 : # Add padding to the C array for convolution with kernel size=4. C_p = [0]+C+[0] # to store nA, nB and cat the next level. N_next = [] C_next = [] debug_all = [] # We reach a new level; hence, depth += 1. d += 1 # Compute nA, nB and c for each index i. for i in range(len(C)+1): # Step 1. compute nA, nB nA = N[0][0]-4+i*3 nB = N[0][1]-i*3 # Step 2. compute c c = sum(C_p[i:i+2]) debug_all.append((nA,nB)) # Step 3. compute pA and pB, and add them to pA_root and pB_root # Case 1: The current node is a leaf, and soup A empty first. if nA <= 0 and nB > 0: pA_root += c*(0.5**d) # Case 2: The current node is a leaf, and soup B empty simultaneously. elif nA<=0 and nB <= 0: pB_root += c*(0.5**d) # Case 3: The current node is a leaf, and soup B empty first. elif nA > 0 and nB <= 0: pass # Case 4: The current node is not a leaf, and we need to compute for its children. else: # Step 4. collect non-leaf node for the next level C_next.append(c) N_next.append((nA, nB)) # Repeat the same steps until nA+nB <= 0. C = C_next N = N_next return pA_root, pB_root ``` We use the following experiment to show the result of $p_A^{\text{root}}, p_B^{\text{root}}$ and runtime of both methods ```python import time for k in [1, 2, 5, 10, 20, 50, 100]: n = k*12 print("n = %s"%n) t = time.time() print("Closed-form: pA_root = %.5f, pB_root = %.5f, runtime = %.5f"% (*soupServingsBinaryClosedForm(n), time.time() - t) ) t = time.time() print("Iterative: pA_root = %.5f, pB_root = %.5f, runtime = %.5f"% (*soupServingsBinaryIterative(n), time.time() - t)) ``` The result show that both method produce consistent value of $p_A^{\text{root}}$ and $p_B^{\text{root}}$, while Closed-form approach are much efficient than iterative approach. ``` n = 12 Closed-form: pA_root = 0.65625, pB_root = 0.15625, runtime = 0.00004 Iterative: pA_root = 0.65625, pB_root = 0.15625, runtime = 0.00005 n = 24 Closed-form: pA_root = 0.80615, pB_root = 0.08057, runtime = 0.00002 Iterative: pA_root = 0.80615, pB_root = 0.08057, runtime = 0.00011 n = 60 Closed-form: pA_root = 0.95063, pB_root = 0.01865, runtime = 0.00006 Iterative: pA_root = 0.95063, pB_root = 0.01865, runtime = 0.00050 n = 120 Closed-form: pA_root = 0.99326, pB_root = 0.00242, runtime = 0.00017 Iterative: pA_root = 0.99326, pB_root = 0.00242, runtime = 0.00174 n = 240 Closed-form: pA_root = 0.99983, pB_root = 0.00006, runtime = 0.00055 Iterative: pA_root = 0.99983, pB_root = 0.00006, runtime = 0.01119 n = 600 Closed-form: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.00397 Iterative: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.04460 n = 1200 Closed-form: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.02356 Iterative: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.19261 ``` ![圖片](https://hackmd.io/_uploads/HJKovHC_6.png) # Generalized Pascal Triangle ## Intuition To address the original problem statement of Leetcode 808, we must expand the Binomial Expansion of Pascal's Triangle into the Multinomial Expansion of the Generalized Pascal Triangle. In the state graph of this problem, each non-leaf node has $4$ children, and their connections and UPC are illustrated below. ![圖片](https://hackmd.io/_uploads/BJE2iKTdp.png) In this Pascal Triangle, we can employ multinomial expansion to obtain the closed-form solution of the UPC of a given node if we know its depth $d$ and order $r$, similar to the procedure used for the binary Pascal Triangle. ![圖片](https://hackmd.io/_uploads/H1Lr5q6_a.png) For a node located at $depth=d$ and $order=r$, we can use the polynomial expansion of $(x^0+x^1+x^2+x^3)^d$. The UPC of this node is the coefficient of the term $x^r$, denoted as $c_r$, expressed as: $$ (x^0+x^1+x^2+x^3)^d = c_0x^0+c_1x^1+\cdots+c_rx^r+\cdots+c_{3d-1}x^{3d-1}+\cdots+c_{3d}x^{3d}. $$ Given depth $d$ and order $r$, to derive the closed-form of $c_r$, we define $r_0$, $r_1$, $r_2$, $r_3$, and $\mathcal{R}_r^d$, representing the solution set of counts of $x^0$, $x^1$, $x^2$, and $x^3$ such that $$ \mathcal{R}_r^d = \Big\{(r_0,r_1,r_2,r_3)\Big|r_0+r_1+r_2+r_3 = d, r_1 + 2r_2+3r_3 = r, \text{ and } r_0,r_1,r_2,r_3 \in \mathbb{N}\cup \{0\} \Big\}. $$ Since $(x^0)^{r_0}(x^1)^{r_1}(x^2)^{r_0}(x^3)^{r_3}$ is expanded from $(x^0+x^1+x^2+x^3)^d$, and hence $r_0+r_1+r_2+r_3 = d$, and $(x^0)^{r_0}(x^1)^{r_1}(x^2)^{r_0}(x^3)^{r_3} = x^r$, so $r_1+2r_2+3r_3 = r$. Furthermore, these two equations can be rearranged as $$ \begin{cases} r_0 = d - r + r_2 + 2r_3 \\ r_1 = r - 2r_2 - 3r_3. \\ \end{cases} $$ To obtain the coefficient $c_r$, we sum over the combinations of $r_0$, $r_1$, $r_2$, and $r_3$ in the solution space $\mathcal{R}_r^d$: $$ c_r = \sum_{(r_0,r_1,r_2,r_3)\in\mathcal{R}_r^d} \frac{d!}{r_0!r_1!r_2!r_3!}. $$ For example, we calculate $c_r$ for a given $d=4$, $r=5$, as shown below. ![圖片](https://hackmd.io/_uploads/S1k6yoTu6.png) To calculate $c_r$, we first find the possible values of $r_0, r_1, r_2, r_3$ that satisfy the following system of equations: $$ \begin{cases} r_0 = -1 + r_2 + 2r_3 \\ r_1 = 5 - 2r_2 - 3r_3 \end{cases} $$ The solutions are given by: $$ (r_0, r_1, r_2, r_3) \in \mathcal{R}_5^4 = \{(0,3,1,0),(1,1,2,0),(1,2,0,1),(2,0,1,1)\}. $$ Then, we calculate the result $c_r = 40$ using the formula: $$ c_r = \sum_{(r_0, r_1, r_2, r_3)\in\mathcal{R}_5^4} \frac{d!}{r_0!r_1!r_2!r_3!}. $$ Therefore, we have: $$ \begin{aligned} c_r &= \frac{4!}{0!3!1!0!} + \frac{4!}{1!1!2!0!} + \frac{4!}{1!2!0!1!} + \frac{4!}{2!0!1!1!} \\ &= 4 + 12 + 12 + 12 \\ &= 40. \end{aligned} $$ ## Approach The approach to solve the original Leetcode 808 problem is similar to the method used to solve the simplified problem in this article. The only difference lies in the fact that each non-leaf node has 4 children, requiring the use of the generalized Pascal Triangle to calculate the Unique Path Count (UPC). Here, we discuss the cases where $n$ is even first, and the cases where $n$ is omitted. For $n$ is even (i.e., $n=2k$), the state graph is illustrated below. ![圖片](https://hackmd.io/_uploads/Hyf2p6R_p.png) In this scenario, the leaf nodes satisfying $n_A \leq 0, n_B > 0$ have a depth of $2k$ and orders ranging from $0$ to $2k-1$. However, for the node satisfying $n_A \leq 0$ and $n_B \leq 0$, we cannot directly calculate its UPC due to a missing edge $(4,0)$ connecting $(0,0)$. The UPC of $(0,0)$ is equivalent to the sum of the UCP of $(1,3)$, $(2,2)$, and $(3,0)$, where their orders range from $2k-3$ to $2k-1$, and their depth is $2k-1$. Hence, the closed-form solutions for $p_A^{\text{root}}$ and $p_B^{\text{root}}$ are as follows: $$ \begin{cases} p_A^{\text{root}} = \sum_{r=0}^{2k-1} \sum_{(r_0,r_1,r_2,r_3)\in \mathcal{R}_r^k }\frac{k!}{r_0!r_1!r_2!r_3!} 0.25^k, \\ p_B^{\text{root}} = \sum_{r=2k-3}^{2k-1}\sum_{(r_0,r_1,r_2,r_3) \in \mathcal{R}_r^{k-1} }\frac{(k-1)!}{r_0!r_1!r_2!r_3!} 0.25^k. \end{cases} $$  ## Code The implementation of this closed-form solution is shown below: ```python import math def soupServingsClosedForm(n): # Calculate the UPC c_r def calculate_cr(d,r): cr_sum = 0 max_r2 = min(r // 2, d) for r2 in range(max_r2 + 1): max_r3 = min((r - 2 * r2) // 3, d - r2) for r3 in range(max_r3 + 1): r0 = d - r + r2 + 2 * r3 r1 = r - 2 * r2 - 3 * r3 if r0 >= 0 and r1 >= 0: cr_sum += math.factorial(d) // \ (math.factorial(r0) * math.factorial(r1) * math.factorial(r2) * math.factorial(r3)) return cr_sum k = int(math.floor(n/2)) # calculate pA_root and pB_root pA_root = sum([calculate_cr(k,r) for r in range(2*k)])*(0.25**k) pB_root = sum([calculate_cr(k-1,r) for r in range(2*k-3, 2*k)])*(0.25**k) return pA_root, pB_root ``` We also rewrite the method in our previous article, which iteratively go through each level to count the UCPs. ```python def soupServingsIterative(n): # trivial cases if n == 0: return 0.5 # It can be shown that pA approach to 1 when n is large. if n > 300: return 1 # Initialize. nA, nB = n, n pA_root, pB_root = 0, 0 d = 0 # N is to store (nA, nB), and C is the array to stor c. N = [(nA,nB)] C = [1] # If there is any non-leaf node (nA,nB), continue for the next level. while nA+nB > 0 : # Add padding to the C array for convolution with kernel size=4. C_p = [0,0,0]+C+[0,0,0] # to store nA, nB and cat the next level. N_next = [] C_next = [] # We reach a new level; hence, depth += 1. d += 1 # Compute nA, nB and c for each index i. for i in range(len(C)+3): # Step 1. compute nA, nB nA = N[0][0]-4+i nB = N[0][1]-i # Step 2. compute c c = sum(C_p[i:i+4]) # Step 3. compute pA and pB, and add them to pA_root and pB_root # Case 1: The current node is a leaf, and soup A empty first. if nA <= 0 and nB > 0: pA_root += c*(0.25**d) # Case 2: The current node is a leaf, and soup B empty simultaneously. elif nA<=0 and nB <= 0: pB_root += c*(0.25**d) # Case 3: The current node is a leaf, and soup B empty first. elif nA > 0 and nB <= 0: pass # Case 4: The current node is not a leaf, and we need to compute for its children. else: # Step 4. collect non-leaf node for the next level C_next.append(c) N_next.append((nA, nB)) # Repeat the same steps until nA+nB <= 0. C = C_next N = N_next # Return the result requested from the problem definition. return pA_root, pB_root ``` We use the following experiment to show the result of $p_A^{\text{root}}, p_B^{\text{root}}$ and runtime of both methods ```python import time for k in [1, 2, 5, 10, 20, 50, 100]: n = 2*k print("n = %s"%n) t = time.time() print("Closed-form: pA_root = %.5f, pB_root = %.5f, runtime = %.5f"% (*soupServingsClosedForm(n), time.time() - t) ) t = time.time() print("Iterative: pA_root = %.5f, pB_root = %.5f, runtime = %.5f"% (*soupServingsIterative(n), time.time() - t)) ``` The result show that both method produce consistent value of $p_A^{\text{root}}$ and $p_B^{\text{root}}$. **However, the Closed-form approach are less efficient than iterative approach. This is due to the calculation of $c_r$ involving finding the integer solution of an linear equation set.** We only use brute-force method to implement this part. How to improve the efficient of solving $c_r$ may be discussed in the next article. ``` n = 2 Closed-form: pA_root = 0.50000, pB_root = 0.25000, runtime = 0.00004 Iterative: pA_root = 0.50000, pB_root = 0.25000, runtime = 0.00002 n = 4 Closed-form: pA_root = 0.62500, pB_root = 0.18750, runtime = 0.00004 Iterative: pA_root = 0.62500, pB_root = 0.18750, runtime = 0.00002 n = 10 Closed-form: pA_root = 0.78320, pB_root = 0.08887, runtime = 0.00013 Iterative: pA_root = 0.78320, pB_root = 0.08887, runtime = 0.00006 n = 20 Closed-form: pA_root = 0.89734, pB_root = 0.03801, runtime = 0.00037 Iterative: pA_root = 0.89734, pB_root = 0.03801, runtime = 0.00010 n = 40 Closed-form: pA_root = 0.97172, pB_root = 0.00970, runtime = 0.00206 Iterative: pA_root = 0.97172, pB_root = 0.00970, runtime = 0.00027 n = 100 Closed-form: pA_root = 0.99911, pB_root = 0.00029, runtime = 0.04577 Iterative: pA_root = 0.99911, pB_root = 0.00029, runtime = 0.00188 n = 200 Closed-form: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.55728 Iterative: pA_root = 1.00000, pB_root = 0.00000, runtime = 0.00733 ``` ![圖片](https://hackmd.io/_uploads/BJzR1pktp.png)