---
tags: resources, s22
---

# How to Do a Worst-Case Run-Time Analysis

We determine the worst-case running time of a program by totaling the costs associated with each operation in the program. The costs of operations follow a set of rules, which we outline first, followed by rules for combining costs.

## Costs of individual operations

- Accessing a variable (in the environment) takes 1 step
- Arithmetic (addition, multiplication, etc.) and boolean operations (comparisons, and, or, etc.) each take 1 step
- Calling a function takes 1 step
- Updating the value of a variable in the environment takes 1 step (for the `=`), plus the time to compute the new value
- Instantiating a datatype (e.g., `link(first, rest)` in Pyret) takes 1 step, plus the time to compute each field
- Updating the value stored in a field in the heap takes 1 step for each location you have to follow to get to the right spot, plus the time to compute the new value. For example,
  - `myacct.balance = 100` takes 3 steps: 1 to access `myacct`, 1 to get to the `balance` location, and 1 for the `=`. (There is no cost for using a number, boolean, or string.)
  - `custA.account.balance = 100` would take 4 steps: the extra step compared to the previous example is because we follow two locations (one per `.`) as compared to one in the previous example
- Walking through all elements of a list (whether with recursion or a for-loop) takes a number of steps that depends on the length of the list. Since this isn't known in advance, we introduce a variable for this size. Typically, we introduce a fresh variable for the size of each input whose size can vary. For example, if a function takes a list, we can give the length of the list the variable "N".

## Costs of built-in functions

- `map` and `filter` each take time proportional to the length of the list times the work done in the lambda function.
- `length` on a list takes constant time (not proportional to the length, for reasons we will see in week 3)
- `append` on lists takes time proportional to the total lengths of the two lists
- For our purposes, mathematical operations, such as `num-modulo`, take time 1 to run, plus the cost of the function call and the cost to compute/access the inputs.

## Combining costs

Four rules guide how to combine costs for individual programming constructs into costs for an entire program:

- if two **statements/expressions occur sequentially** (one after the other), add their costs
- for an **`if/else` statement**, add the cost of computing the question to the larger of the costs of the `if` and `else` answers (see the short sketch at the end of this section)
  - A **cases expression** in Pyret is similar. We add the cost of matching the worst-case variant/"shape" to the cost of the worst-case answer. E.g., if the worst case is matched to the List variant/"shape" `link(first, rest)`, the cost of matching would be 1 (cost of asking "is this a link?") + 2 (cost of assigning both `first` and `rest`), and we would add that to the cost of computing the answer that Pyret produces.
- for a **loop**, multiply the size of the data (number of iterations) by the work we do on each individual element
- for a **recursive function**, we write the time it takes to run the function on an input of size N in terms of the time it takes to run the non-recursive code and the time it takes to run the recursive call (in terms of the recursive call's input size). For instance, if we say that the time it takes to run the function is T(N), and the recursive call works on an input that is one smaller, we would write T(N) in terms of T(N-1).

We call this a **recurrence relation.** We can solve for the closed form of the time, or we can leave it as a recurrence relation.
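Before moving on, here is a minimal sketch applying the sequencing and `if/else` rules, in the same annotation style as the examples below. The variable names and values here are arbitrary, chosen only so the snippet runs on its own.

```
some_input = -3        # arbitrary input so the snippet is self-contained: 1 (assign)
x = some_input - 5     # 1 (access some_input) + 1 (-) + 1 (assign)
if x < 0:              # 1 (access x) + 1 (<)
    y = 0 - x          # 1 (access x) + 1 (-) + 1 (assign)   <- larger branch
else:
    y = 0              # 1 (assign)                          <- smaller branch
print(y)               # 1 (access y) + 1 (function call)
```

Sequencing says to add up the pieces; the `if/else` rule says to add the question (2 steps) to the larger branch (3 steps). The worst-case total is 1 + 3 + (2 + 3) + 2 = 11 steps, a constant that does not depend on the size of any input.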
### Keeping it simple: focus on the worst case

In practice, we can't predict the exact number of steps that will be taken in a program without knowing the exact input. For example, consider a program to check whether an item is in a list:

```
for elt in some_list:
    if elt == data_to_find:
        return True
return False
```

How many times will we perform the comparison check? It depends on where (if anywhere) the `data_to_find` is in the input list. How do we predict the cost of this code, then? We will make a **worst-case assumption** that the `data_to_find` is not in `some_list`, and hence we will check every element. This is a sufficient approximation for many settings, and it is a more manageable way to get started thinking about costs.

---

### Annotating programs for run-time: for-loop in Python

Here's a concrete example of annotating each line of a program with its costs:

```
count = 0                   # 1 (assign)
for elt in some_list:       # N
    if max(elt, 10) == 10:  # 1 (access elt) + 1 (max) + 1 (==)
        count = count + 1   # 1 (access count) + 1 (+) + 1 (assign)
return count                # 1 (access count) + 1 (return)
```

Then, we total up the costs:

```
1 + (N * (3 + 3)) + 1 = 6N + 2
```

### Annotating programs for run-time: recursion in Pyret

:::info
Updated 2/5 1:23 pm: accessing the variable some-list in cases costs 1 operation. The example has been amended to add this.
:::

Here is some code doing the same computation, written using recursion in Pyret. Because it is recursive, we write its cost as a recurrence relation. Let's say the cost of running `r-fun` on an input of size `N` is `T(N)`. Then,

```
fun r-fun(some-list :: List):
  cases(List) some-list:          # 1 (access some-list)
    | empty => 0                  # not the worst case
    | link(fst, rst) =>           # 1 (match link) + 1 (assign fst) + 1 (assign rst)
      if num-max(fst, 10) == 10:  # 1 (access fst) + 1 (num-max) + 1 (==)
        1 + r-fun(rst)            # 1 (+) + 1 (access rst) + 1 (function call) + T(N-1) (run r-fun on rst)
      else:
        r-fun(rst)                # not the worst case
      end
  end
end
```

Then, we total up the costs and write it as a recurrence:

```
T(N) = 1 + 3 + 3 + 3 + T(N-1) = 10 + T(N-1)
```

---

## Examples of recurrences

We can start to see the pattern in recurrence relations by assuming that `T(0) = C`, that is, the cost of running a recursive function on an input of size 0 will be some constant cost C. Then, we can start to write down some iterative examples for common recurrence relations:

### T(N) = B + T(N-1)

We have

```
T(0) = C
T(1) = B + T(0) = B + C
T(2) = B + T(1) = B + (B + C) = (2 * B) + C
T(3) = B + T(2) = B + ((2 * B) + C) = (3 * B) + C
T(4) = B + T(3) = B + ((3 * B) + C) = (4 * B) + C
```

The pattern that emerges is that `T(N) = (N * B) + C`.

### T(N) = B + A*N + T(N-1)

We have

```
T(0) = C
T(1) = B + A*1 + T(0) = B + A + C
T(2) = B + A*2 + T(1) = 2 * B + 3 * A + C
T(3) = B + A*3 + T(2) = 3 * B + 6 * A + C
T(4) = B + A*4 + T(3) = 4 * B + 10 * A + C
```

It is a little harder to see the pattern here, but it turns out to be

```
T(N) = N * B + (N^2 + N)/2 * A + C
```

### T(N) = B + T(N/2)

Let's assume that `T(1)` takes some time `K`. Then,

```
T(2) = B + T(1) = B + K
T(4) = B + T(2) = 2 * B + K
T(8) = B + T(4) = 3 * B + K
T(16) = B + T(8) = 4 * B + K
```

The pattern is `T(N) = log2(N) * B + K`, where `log2(N)` is the base-2 logarithm of N.
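As a quick sanity check on these closed forms, here is a small Python sketch that unrolls the first two recurrences and compares them against the claimed patterns. The particular constants `A`, `B`, and `C` are arbitrary choices; any values would work the same way.

```
# Unroll the recurrences numerically and compare to the closed forms.
B = 3   # cost of the non-recursive work in each call (arbitrary)
C = 1   # cost of the base case T(0) (arbitrary)
A = 2   # per-element cost in the second recurrence (arbitrary)

def t1(n):
    # T(N) = B + T(N-1), with T(0) = C
    return C if n == 0 else B + t1(n - 1)

def t2(n):
    # T(N) = B + A*N + T(N-1), with T(0) = C
    return C if n == 0 else B + A * n + t2(n - 1)

for n in range(10):
    assert t1(n) == n * B + C                         # claimed: T(N) = N*B + C
    assert t2(n) == n * B + (n * n + n) // 2 * A + C  # claimed: T(N) = N*B + (N^2+N)/2 * A + C
print("closed forms match for N = 0..9")
```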
---

## When is one cost better than another?

After using the rules to compute costs, we are left with cost summaries along the lines of

- $N * 2 + 4$
- $N * 2 + 7$
- $N + N + N + 8$
- $N * (N + 3) + 2$
- etc.

When should we say that one cost is worse than another? Technically, the costs in these expressions get worse from the first expression to the last. But do we really care about a difference of 3 operations (as between the first two) in the context of processing a large list? If the list were large enough, the cost of visiting every element is clearly more impactful than an extra operation here and there.

In practice, we classify program costs by the nature of the function from the size of the input to the cost. Let's recast our four examples as functions named $T$ (for "time"):

- $T1(N) = N * 2 + 4$
- $T2(N) = N * 2 + 7$
- $T3(N) = N + N + N + 8$
- $T4(N) = N * (N + 3) + 2$

If we were to graph these functions over increasing values of `N`, we'd find that the first three yield straight-line graphs (constant slopes), while the fourth curves upward over time.

Back in high school, you learned terminology for functions based on the shapes of their graphs, such as "linear", "quadratic", and "exponential". As a reminder, here's a pictorial view:

![](https://i.imgur.com/wrNK86V.png)

*Linear (red), quadratic (blue), and exponential (green) growth [image from [Wikipedia](https://en.wikipedia.org/wiki/Exponential_growth)]*

In computer science, we typically describe the running times of programs in these terms. For example, for a list-membership function, we might say

- `member` is linear, or
- `member` is in $O(N)$ (where $O(N)$ is the set of all functions whose growth is linear in the size of the input $N$)

Specifically, the terms you will see in computer science are:

- constant -- $O(1)$
- logarithmic -- $O(log N)$
- linear -- $O(N)$
- $O(N * log N)$ [serves as its own name]
- quadratic -- $O(N^2)$
- exponential -- $O(2^N)$

*[For those feeling rusty on logarithmic growth, it is the inverse of exponential, and grows much more slowly than linear. For a graphical representation, look at the top-left red/blue/green graph on this [Wikipedia page on logarithmic scales](https://en.wikipedia.org/wiki/Logarithmic_scale) (about halfway down on the right).]*

When we compare algorithms for time, the higher in this list an algorithm is classified, the faster it will be as the data get large. We don't worry about time comparisons for small data (they'll be fast enough because the data are small). We worry about larger data.
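One way to see this concretely is to evaluate the four example cost functions for a few increasingly large values of `N`. This short Python sketch does exactly that; the particular values of `N` are arbitrary.

```
# Evaluate the four example cost functions for growing N.
def T1(N): return N * 2 + 4
def T2(N): return N * 2 + 7
def T3(N): return N + N + N + 8
def T4(N): return N * (N + 3) + 2

for N in [10, 100, 1000, 10000]:
    print(N, T1(N), T2(N), T3(N), T4(N))

# At N = 10000, T1, T2, and T3 are each a few tens of thousands of steps,
# while T4 is over 100 million: the N^2 term dominates everything else.
```

The 3-step gap between $T1$ and $T2$ never matters; the gap between linear and quadratic growth eventually dwarfs every constant, once `N` is large enough.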