# Big O (Metaphorical Uses of Mathematics)
(notes for the Tue, Apr 23 lecture of Algorithm Analysis 2024)
---
*Economic theory formulates thoughts via what we call “models.” The word model sounds more scientific than the word fable or tale, but I think we are talking about the same thing.* ([Ariel Rubinstein](https://library.oapen.org/bitstream/id/5429a087-8bb8-4cb5-9c1a-59954e2341fc/646736.pdf))
I am not denying that physics, computer science and engineering are more of a "hard science" (take this with a grain of salt) than economics, but I do believe that Ariel Rubinstein's insight is also true for much of the applications of mathematics in these areas. This lecture can be seen as a case study in support of this claim.
(Btw, if you dig around in [Ariel Rubinstein's publications](https://arielrubinstein.tau.ac.il/publications.html), you will find a lot of beautiful mathematics and economics.)
---
## Introduction
The definition of Big O is quite technical and it is not immediatly clear from the technical definition why it is so important.
In these notes I argue that the definition of Big O is less important for its mathematical content but rather because the precise mathematical definition creates a powerful **metaphor**. This metaphor, together with its countless variations and ramifications, underpins all of practical software engineering. But we never use the definition of Big O to crunch some numbers or establish benchmarks.
In the remainder of the lecture, I first state the definition of Big O and then explain its importance as a metaphor for software engineering. If I have time at the end, I will review some other examples of mathematical metaphors in software engineering. (In fact, we can see the previous two lectures on Turing machines and P vs NP in that new light as well.)
## Classifying Algorithms
The aim is to classify algorithms into broad clusters according to the amount of resources they consume. Most often, the resources we are interested in are time (runtime) or space (memory usage).
- Given a problem we want to distinguish algorithms that use less resources from those that use more resources.
- We also want to distinguish problems that can be solved using less resources from problems that can only be solving using more resources.
**Exercise/Activity:** Give examples that flesh out the two bullet points above.
Ok, so what can we do to work towards a classification of algorithms and a classification of problems?
Choose your favourite algorithm $A$.
Let $f$ be the function that measures how long the algorithm runs depending on the size of its input.
**Exercise/Activity:** Discuss the definition of $f$. What is needed to turn the statement above into a precise definition. Do you see any problems with the precise definition?
Now let us say we have two algorithms, $A$ and $B$ and two functions $f$ and $g$. $f$ measures the runtime of $A$ and $g$ measures the runtime of $B$.
**Exercise/Activity:** Which examples would interest you?
Classic examples come from sorting. For example, we may take
|A|B|
|:---:|:---:|
|bubble sort | insertion sort |
|insertion sort | merge sort |
| merge sort | quick sort |
**Exercise/Activity:** What can you say about the time complexity of these algorithms?
**Exercise/Activity:** Can we extract from the examples above a list of requirements that the mathematical definition of our classification should satisfy?
## Big O - Definition
We now change direction, instead of working from the problem (classification of algorithms) to the solution (Big O), we now start from the definition of Big O and then discuss whether it meets our requirements.
Given non-negative functions $f,g$ one writes[^writes]
$$f \in O(g)$$
[^writes]: Or sometimes $f(n)\in O(g(n))$ or also $f(n)=O(g(n))$.
if there are $N$ and $M$ such that
$$ f(n) \le M\cdot g(n)$$ for all $n\ge N$.
**Exercise/Activity:** Discuss whether this definition needs some more precision or side conditions. Take into account that we are interested in functions that measure resource usage depending on the size of the input.
## Big O - Properties
We now list some desirable properties of the Big O notation.
The following are immediate from the definition:
**Proposition**:
- $f\in O(f)$.
- If $c$ is a constant then $O(c\cdot f)=O(f)$.
- If $f\le g$ for large enough inputs then $O(f)\subseteq O(g)$.
- $O(f)\subseteq O(f+g)$.
If $h$ is a non-negative function on input sizes we also have:
**Proposition:** If $O(f)\subseteq O(g)$ then $O(f+ h)\subseteq O(g + h)$.
**Proposition:** If $O(f)\subseteq O(g)$ then $O(f\cdot h)\subseteq O(g\cdot h)$
We can use these basic properties to derive further useful properties (add assumptions if needed):
**Proposition:**
- $j\le k$ implies $O(n^j)\subseteq O(n^k)$.
- $j\le k$ implies $O(n^j+n^k)\subseteq O(n^k)$.
- $O(\sum_{i=0}^k a_in^i)=O(n^k)$.
- $O(\log n)\subseteq O(n)$.
- $O(n\log n)\subseteq O(n^2)$.
**Exercise:** Which relationship do we have between the following? Prove your claim.
- $O(n)$ and $O(\sqrt n)$
- $O(n^2)$ and $O(2^n)$
- $O(\log n)$ and $O(\log n \cdot \log n)$
**Remark**: I found that GPT4 does a good job to prove the propositions and to answer the exercise. I anyway recommend to find your own proofs and answers before checking with GPT4.
## Conclusion
We started with a quote from the economist Ariel Rubinstein and then looked at Big O, suggesting that Big O may be considered as a software engineering fable. I guess it is controversial, and possibly even unhelpful, to call Big O a fable. So I want to conclude with a caveat and further thoughts.
The distinction I want to make is between mathematics that is "operational" and mathematics that is not. By operational mathematics, I mean formulas or algorithms that will end up in the products we are engineering. Like graphics software that keeps solving quadratic equations or machine learning software that keeps multiplying matrices.
I also include in operational mathematics the formulas and algorithms that are in the software tools that we use to create the products. Compilers are a prime example (regular expressions, context free grammars, type theory, lambda calculus). Much of logic is also operational as witnessed by software tools such as SAT-solvers and model checkers.
I believe that the vast amount of operational mathematics is what distinguishes our field (software engineering) from most other non-engineering disciplines.
For example, mathematical models in economics (say mathematical models of the labor market) are typically not operational. They do not allow us to answer the question whether rising the minimum wage leads to more unemployment in the real world. That does not mean that these models are not useful.
Similarly, in engineering non-operational mathematics can also be useful. The example of the lecture was Big O. We cannot use Big O to predict how long an algorithm will run on a given input. So what is the use of non-operational, or metaphorical, mathematics for engineering? [^metaphorical]
## Acknowledgements
I am grateful to Daniel Kronovet who explained to me how he used information theory as a metaphor in designing his software [Chore Wheel](https://github.com/zaratanDotWorld/choreWheel). The story of the engineer who used calculus exactly once in his career was related to me by Drew Moshier.
## Appendix: metaphorical? conceptual?
As was pointed out to me by Nachoem Wijnberg, the use of "metaphorical" for non-operational mathematics can lead to severe misunderstandings as this term has a well-understood and different meaning in literature and linguistics. Looking for alternatives, phrases such as background theory, conceptual framework, analytical lens, or heuristic guide come to mind.
Maybe **conceptual mathematics** is a good choice.
([Conceptual Mathematics](https://archive.org/details/F.WilliamLawvereStephenH.SchanuelConceptualMathematicsAFirstIntroductionToCatego) is also the title of a book and while Lawvere and Schanuel do not mean exactly what I am after, there is some overlap.)
[^metaphorical]: As the concluding question emphasizes again, my preliminary distinction between operational and metaphorical mathematics takes an engineering point of view. It is less clear whether it makes sense as a distinction inside mathematics, even though it seems to be related to constructive vs non-constructive mathematics.