We've talked some about making programs more readable and concise, but we haven't talked much yet about making programs fast (or efficient in other ways). Today, we’ll start to learn how to reason carefully about program performance.
Here’s a Python program:
What's the running time of the count_the
function? One way to find out would be to actually time an execution on some real data. We could load in Frankenstein again, call the function on the list that split()
returns, and actually time it.
That went by far faster than we could actually count with a stopwatch. But the approach has other flaws too. What are they?
Some of the time will be taken up by starting the program, loading the file into memory, and other operations that have nothing to do with the runtime of count_the
.
Our stopwatch only tells us the runtime for a specific computer, under a specific existing workload, in a specific environment. There will also be some randomness involved.
This tells us how long the program takes on Frankenstein, but what about other inputs? How is the function likely to scale as the books get bigger? We can't run the function on all possible books, so how can we really predict its performance overall?
The truth is that, when we talk about how time-efficient a program is, there are two very different things we might mean.
(1) How long is the program taking to run on a particular input, or on a set of inputs?
(2) How well does the program scale as the input becomes larger or more complex?
Which question you're asking depends on what problems you're trying to solve. For today, we'll focus on the second question: how well does a program scale?
Let’s see if we can figure out a rough estimate of how long it takes to run count_the
without actually running it. We can start by counting the "basic operations," like adding numbers or assigning to a variable, executed in the function:
So for a list of length L, the program executes (3 * L) + 3 operations.
This might sound odd. Not all operations take the same amount of time (and there's a fair bit of variation across different computers). We are losing a bit of precision, but we're gaining generality. Here's what I mean: in general, what is going to dominate the runtime of the program: the setup, or the loop? And, as inputs get bigger, what's going to dominate the runtime: the number of times the loop runs, or the time it takes to run the loop?
If we're really concerned with how our function's performance scales as inputs get bigger, the exact details of how long an operation takes often become irrelevant. Instead, we'll say that the running time of count_the
is proportional to (2 * L) + 3.
Notice how many assumptions we’re making here! For one, we’re not counting the for-loop itself as an operation; do we need to? Python’s operations probably don't actually take the same amount of time; does it matter?
For some purposes, the answers to these questions absolutely matter. Some code is run inside Google every time a user does a web search. The exact running time of that code is very important to Google! But if you're trying to pick a data structure or an algorithm (more on this later) concrete runtime is less important than how your choice will scale.
In fact, even this much precision is more than we’ll generally use going forward! It's hard to be sure about how long specific operations take: if returning from a function takes a while, then the real runtime could be more like (2 * L) + 100. If addition takes a while it could be (50 * L) + 3. So we'll focus on the most important value here: the list length. We'll say that our function's running time is linear in L. That is, it does some constant amount of work for each element of L, and an additional constant amount of work regardless.
Next class we'll introduce some commonly-used notation for expressing this idea. But before that, let's look at a couple more functions.
How about this one?
How long does this function take, for a list of length L?
The running time of member depends on its arguments even more than the running time of count_the
did!
Best Case: If we call it on a list whose first element is ele
, it really doesn’t matter how long the list is. It's going to check the first element and no more, so it'll take the same, constant, amount of time regardless of how long the list is. We can express this by saying that the best case time is constant.
Computer programming is an engineering discipline, though. If I'm building a bridge, I'm probably not all that interested in the maximum weight it can tolerate under ideal conditions! What we're usually interested in (though we’ll see some exceptions, later in the course) is worst-case running time: how long will the function take on the worst possible input or inputs?
Worst Case: If ele
isn’t actually in l
, then the loop in is_member
needs to traverse the entire list. As with count_the
, we'll say that the function runs time linear in the length of l
.
We’ll take a look at one last function:
This one is a bit more complicated. The for loop runs once for every element in the list, but how long does every iteration of the loop take? To figure that out, we need to learn something about how lists function in Python.
In Python, x not in seen
is basically equivalent in to calling our previous example function. To check membership in a list, Python will walk the list just like our function did. But that means that every time the for
loop in distinct
runs, that single iteration takes linear time in the length of seen
.
Best Case: What’s the best case? Well, if all of the elements in l
are the same, x not in seen
never takes more than a constant number of operations. So in that case, the function runs in linear time.
Worst Case: How about the worst case? What if the elements are all different?
If all the elements are different, then every x
will be new, and thus won't be found in the seen
list. But then the size of the seen
list grows every time the loop iterates: at first it's 0, then 1, then 2, and so on.
Some geometric intuition might be useful here:
How many blocks are there in this diagram? (0 + 1 + 2 + ... + (len(l) - 1))
of them. With a little algebra, this simplifies to (L * (L - 1))/2
. Which means that the function will scale with the square of the length of the input. We'll call this quadratic time.
How much worse is this? Does it actually matter in practice? Let's find out. We'll time how long it takes to runcount_the
on the text of Frankenstein, and how long it takes to run distinct
on the same input.
The difference is actually quite noticeable even without a timer: 30 milliseconds versus over 2 full seconds. Nearly 2 orders of magnitude.
What if we tried on a larger book? I've downloaded the full text of an English translation of Mallory's The Death of Arthur, which is a bit longer:
(That's 78122 words versus 368286 words.)
Now it's an even bigger gap! 80 milliseconds versus over 8 seconds. And it'll only get worse as the data gets bigger, unless we get "lucky" and the input is closer to the best case.
(If time permits) It's useful to consider the ways that runtime matters in the world. People often point out that a faster program uses less electricity and therefore has a lower climate impact. While this is true, I personally tend to be skeptical of generalizing this, since faster execution also makes room for running more programs.
Sure, there are lots of business-focused reasons to have fast code, but there is at least one social factor: access. When building software, we don't know what kind of computer our eventual users will have. A laptop that a highly-paid developer in San Diego or New York might consider "average" could be a powerful and expensive luxury for others.