# RX-APSP Notes #

The goal of this project is to study a randomized approximation algorithm for solving the all-pairs shortest paths (APSP) problem. In APSP, you are given a graph and you would like to find, for every pair of vertices, the shortest path that connects them and/or the length of such a path. At Georgia Tech, you would most likely encounter the APSP problem in a class like [CS 3510](http://www.cs3510.com), where you "meet" the classic textbook algorithm for it known as the Floyd-Warshall algorithm. But whereas Floyd-Warshall calculates exact paths and path lengths, in this project we will care only about _estimating_ the shortest paths and path lengths.

The project involves some graph theory, linear algebra, probability, and abstract algebra. It's ideal if you've already taken classes in these subjects, but the problem is "simple enough" that you can learn what you need as you go. For that, I've compiled some warm-up material. There are four suggested warm-ups, plus a fifth that is still TBD; once you've made it through these, I think we can start!

1. [Warm-up 1](#Warm-up-1-Jupyter-Triangle-Counting): Triangle counting, where graphs meet linear algebra via matrix multiplication
2. [Warm-up 2](#Warm-up-2-reading-Approximate-Matrix-Multiplication): A simple method to approximate matrix multiplication
3. [Warm-up 3](#Warm-up-3-Jupyter-Approximate-Triangle-Counting): Approximate triangle counting via approximate matrix multiplication
4. [Warm-up 4](#Warm-up-4-reading-APSP-all-pairs-shortest-paths): Exact APSP (all-pairs shortest paths)
5. Warm-up 5: TBD, but related to abstract algebra?

## Other preliminary comments ##

> There is a 5th warm-up, but I'm still collecting materials for that. I'll update these notes when I do.

## Revision history ##

1. _[December 28, 2023]_ Initial version.
2. _[December 29, 2023]_ Changed the suggested reading for Warm-up 2.
3. _[January 8, 2024]_ Added table-of-contents and other minor wordsmithing.
4. _[February 14, 2024]_ Added some related refs to XTC (approximate triangle counting).
5. _[March 11, 2024]_ Added a new RandNLA survey at the bottom of Warm-up 3.

# Warm-up 1 (Jupyter): Triangle Counting #

This first warm-up is intended to familiarize you with the idea that a graph can be viewed as a matrix, and that certain graph-analysis problems can in turn be viewed in the language of linear algebra. You'll see this connection through the concrete example of counting the number of _triangles_ in an undirected graph.

The problem is introduced with code examples in the following Jupyter notebook. Read through and run it. If it's helpful, there is also a video lecture that I recorded when I used this notebook in one of my on-campus classes, CSE 6040.

* Triangle Counting, Three Ways: [GitHub](https://github.com/rvuduc/tricount3ways)
* Accompanying video walkthrough: [GT MediaSpace—**skip** to the 50:58 mark](https://mediaspace.gatech.edu/media/CSE+6040%2C+Fall+2023A+10-26+%E2%80%94+Linear+algebra+re-review+%2B+triangle+counting+%22three+ways%22/1_wg1s7b2a)

# Warm-up 2 (reading): Approximate Matrix Multiplication #

The execution-time bottleneck of the triangle-counting example in Warm-up 1 is **matrix multiplication.** For $n \times n$ matrices, the inputs are of size $O(n^2)$ but the number of multiplications and additions scales like $O(n^3)$, which is superlinear in the size of the input, i.e., for $N \equiv n^2$ inputs, the operations scale like $O\left(N^{3/2}\right)$.

> _Note:_ For matrix multiplication over real or complex numbers, there is in fact a way to do it exactly in asymptotically fewer operations, i.e., $O(n^{2+\omega})$ operations where $0 < \omega < 1$. For instance, an algorithm by Strassen (1969) has $\omega \approx 0.81$. However, most of these algorithms have large constants or other issues that make them impractical.
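To make the bottleneck concrete, here is a minimal sketch of the Warm-up 1 idea, written for these notes against NumPy with a dense adjacency matrix. It is only an illustration (the `tricount3ways` notebook is the authoritative version, and the function name `count_triangles` is made up here): it counts triangles as $\operatorname{trace}(A^3)/6$, and the two dense multiplies inside are exactly the cubic cost in question.

```python
import numpy as np

def count_triangles(A: np.ndarray) -> int:
    """Count triangles in an undirected graph given its dense 0/1
    adjacency matrix A. trace(A^3) counts each triangle 6 times
    (3 starting vertices x 2 traversal directions), hence the /6."""
    A3 = A @ A @ A  # two dense matrix multiplies: the O(n^3) bottleneck
    return int(np.trace(A3)) // 6

# Example: a 4-cycle {0-1, 1-2, 2-3, 3-0} plus the chord 1-3
# has exactly two triangles: {0, 1, 3} and {1, 2, 3}.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])
print(count_triangles(A))  # -> 2
```

Both multiplies in `A @ A @ A` dominate the running time, which is what the rest of this warm-up is about reducing.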
If we want to further reduce this cost and do not need an _exact_ result, one strategy is to _approximate_ the result. One method for doing matrix multiply approximately is described in the following notes:

* Michael W. Mahoney (2016). _Lecture notes on randomized linear algebra._ arXiv:[1608.04481](https://arxiv.org/abs/1608.04481)

Read Sections 1 (overview) and 2 (approximate matrix multiply), focusing on the following:

- For Section 1, focus on Section 1.4 ("A simple model..."), especially the `Select` algorithm (Algorithm 1), and Section 1.5. In Section 1.4, `Select` is a simple and beautiful example of a randomized algorithm, so understanding it in detail (the model of computation it assumes and how it works) will build your intuition for the more complicated algorithms we'll encounter later in the project. The rest is "big picture" background on randomized numerical linear algebra (RandNLA). Those remarks might be helpful for understanding the context of RandNLA but are not essential for the project.
- For Section 2, focus on the structure of the algorithms and the implications of the lemmas/theorems. You are welcome to skim or even skip the detailed analysis and proofs, which we can revisit later in the project whenever we think we need them.

As an additional coding exercise, you might implement the `Select` algorithm and run experiments to verify that it really works as claimed.

> _Note:_ For a more condensed summary of approximate matrix multiply, see [these slides](https://www.stat.berkeley.edu/~mmahoney/talks/rla_simons_bootcamp_f18.pdf). For a shorter but more recent set of notes co-authored by Mahoney, see Petros Drineas and Michael W. Mahoney (2017), _Lectures on Randomized Numerical Linear Algebra,_ arXiv:[1712.08880](https://doi.org/10.48550/arXiv.1712.08880) (Section 4).

# Warm-up 3 (Jupyter): Approximate Triangle Counting #

Let's combine Warm-ups 1 + 2 by replacing the exact matrix multiply from triangle counting (Warm-up 1) with the approximate matrix multiply (Warm-up 2). Here is a Jupyter notebook that does just that; a standalone sketch of the same idea also appears below.

* Approximate Triangle Counting via Approximate Matrix Multiply: [GitHub](https://github.com/hpcgarage/approx-tricount). _Note:_ This repository is private, so send Rich your github.com username if you need access.

> Koby found an approximate triangle-counting method based on trace estimation (see [his post on Slack](https://hpcgarage.slack.com/archives/C06CRDXD7JP/p1707932482832759)). We should run down this reference and everything that cites it (see Slack). For the trace itself, there is a randomized method to estimate it; see Section 5 of this [Murray et al. (2023) RandNLA survey](https://arxiv.org/abs/2302.11474).
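Since the repository above is private, here is a minimal, self-contained sketch of the combination. It is an illustration written for these notes, not the notebook's code, and the function names are made up: it assumes the column/row-sampling approximate multiply from Section 2 of Mahoney's notes (sample index $k$ with probability proportional to $\|A_{:,k}\|\,\|B_{k,:}\|$ and rescale for unbiasedness), then counts triangles via $\operatorname{trace}(A^3)/6 = \langle A^2, A \rangle / 6$ for symmetric $A$.

```python
import numpy as np

def approx_matmul(A, B, c, rng):
    """Randomized matrix multiply in the style of Mahoney's notes:
    sample c column-row pairs with probabilities p[k] proportional to
    ||A[:, k]|| * ||B[k, :]||, rescaling each sampled outer product by
    1/(c * p[k]) so the result is an unbiased estimate of A @ B."""
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    ks = rng.choice(A.shape[1], size=c, p=p)
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in ks:
        C += np.outer(A[:, k], B[k, :]) / (c * p[k])
    return C

def approx_count_triangles(A, c, seed=0):
    """Estimate trace(A^3)/6 = <A@A, A>/6 for a symmetric 0/1 adjacency
    matrix A, replacing the exact product A@A with a sampled one."""
    rng = np.random.default_rng(seed)
    A2_hat = approx_matmul(A, A, c, rng)
    return (A2_hat * A).sum() / 6
```

Because the sampled product is an unbiased estimator of $A \cdot A$ and the triangle count is linear in it, the estimate is unbiased; increasing `c` or averaging independent runs reduces its variance (the estimate is real-valued, so round it if you want an integer). The trace-estimation route in the note above is a different way to randomize the same quantity $\operatorname{trace}(A^3)$.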
# Warm-up 4 (reading): APSP, all-pairs shortest paths #

The problem of computing all-pairs shortest paths (APSP) can also be viewed as a kind of matrix multiplication. So if we can approximate matrix multiply, then perhaps we can also approximate APSP.

The tricky part, which is why this project is a research project, is that APSP involves matrix multiply over an algebraic structure called the _tropical semiring_, which is an instance of a _dioid_ ("die-oyd"). Don't worry if that sounds like gibberish; we'll gradually work our way toward understanding this concept in more detail later. (See the note about abstract algebras, below.) However, it is this fact that makes applying a randomized approximate matrix multiply, like the one you used for triangle counting, less than straightforward. (Well, it will be easy to _apply_, but I think it will be much harder to _analyze_.)

For this warm-up, read about the APSP problem and its most famous solution, the Floyd-Warshall algorithm. A good online reference is the following:

* Jeff Erickson (2019). "[Chapter 9: All-pairs shortest paths](https://jeffe.cs.illinois.edu/teaching/algorithms/book/09-apsp.pdf)," in _Algorithms_, 1st edition. Online: https://jeffe.cs.illinois.edu/teaching/algorithms/

You can focus on Sections 9.1, 9.2 ("obvious" APSP), and 9.8 ("(Kleene-Roy-)Floyd-Warshall(-Ingerman)"). The latter is the "Floyd-Warshall" algorithm that is usually covered in CS 3510. Take special note of Erickson's description of APSP using a "funny" matrix multiply; a minimal code sketch of it appears at the end of these notes. That is the concept that will connect matrix multiply "over the reals" (the one you learn about in MATH 1554) with one defined over the so-called "tropical semiring."

> **A note about _abstract algebra_.** One way to understand the connection between some graph problems and linear algebra is through [abstract algebra](https://en.wikipedia.org/wiki/Abstract_algebra). You might encounter this mathematical tool in MATH 4107 at GT, but if it's your first time seeing it, here is some background about the topic that you may find helpful.
>
> One of the first "algebras" you learn in school is _the reals_, a term which is shorthand for the _[real-valued number field](https://en.wikipedia.org/wiki/Real_number)_.
>
> In this algebra, values are real numbers, like `5`, `-100.3`, `4/3`, `3.14159...`, and so on. You can combine values to produce other values via basic arithmetic operations, like addition and multiplication. These operations have special properties. For instance, they have inverses (subtraction and division) and are commutative (`a + b = b + a`, `a * b = b * a`).
>
> When we "change algebras," we can change all of these aspects of the algebraic system. We might replace real-number values by integers, or we might replace addition with something that is _not_ invertible, like `min(a,b)` or `max(a,b)`.
>
> When the structure of two algebras is the same, we can prove something in one algebra and it will hold "automatically" in the other. So the power of thinking about algebras in an abstract way is to facilitate this translation of facts and ideas from one domain to another. For example, when we change the domain of values from real numbers to complex numbers, we can define the basic operations (addition and multiplication) in such a way that facts about the reals are also true for complex numbers.
>
> When two algebras are different, some facts about one algebra may carry over to the other, but other facts may "break" if we try to check them in the other algebra.
>
> There are many hidden and practical applications of abstract algebra in computer science, including in software engineering and parallel computing.
> In software engineering, the idea of object types is an application of abstract algebra. For instance, consider inheritance in an object-oriented language. When you derive a child class from a base class, you might supply a different implementation in the child of a method or function from the base class. That is analogous to changing the "definition" of "addition" from `a + b` to `min(a, b)`, as we do when we "redefine" matrix multiply in the APSP problem.
>
> In parallel computing, if an operation commutes, such as `foo(a,b) = foo(b,a)`, we can exploit this ability to "reorder" the evaluation of `a` and `b` to extract more parallelism from what might otherwise appear to be a sequential program.
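Finally, here is the code sketch of Warm-up 4's "funny" matrix multiply promised above. It is an illustration written for these notes, not Erickson's code, and the function names are made up. It assumes NumPy and a dense matrix `D` of one-hop distances, with zeros on the diagonal and `np.inf` marking missing edges: replacing $(+, \times)$ by $(\min, +)$ turns repeated matrix "squaring" into an APSP solver, which we can sanity-check against Floyd-Warshall.

```python
import numpy as np

def min_plus_matmul(A, B):
    """'Funny' matrix multiply over the tropical semiring:
    C[i, j] = min_k (A[i, k] + B[k, j]).
    Ordinary (+, *) becomes tropical (min, +)."""
    n = A.shape[0]
    C = np.full((n, n), np.inf)
    for k in range(n):
        # An outer "sum" replaces the outer product of ordinary matmul.
        C = np.minimum(C, A[:, [k]] + B[[k], :])
    return C

def apsp_by_squaring(D):
    """Repeatedly 'square' the one-hop distance matrix D in the
    (min, +) sense; after ceil(log2(n - 1)) squarings, D[i, j] is
    the shortest-path length from i to j."""
    n = D.shape[0]
    hops = 1
    while hops < n - 1:
        D = min_plus_matmul(D, D)
        hops *= 2
    return D

def floyd_warshall(D):
    """Textbook Floyd-Warshall, for comparison (O(n^3) total work)."""
    D = D.copy()
    for k in range(D.shape[0]):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

# Tiny example: a weighted directed graph on 4 vertices, with np.inf
# marking missing edges and 0 on the diagonal.
D = np.array([[0.,     3.,     np.inf, 7.],
              [np.inf, 0.,     1.,     np.inf],
              [np.inf, np.inf, 0.,     2.],
              [2.,     np.inf, np.inf, 0.]])
print(apsp_by_squaring(D))
print(floyd_warshall(D))  # same answer
```

Notice what we lose relative to the reals: `min` has no inverse (there is no "subtraction"), which is precisely the dioid/tropical-semiring wrinkle that makes analyzing a randomized approximate version of this multiply the research question of this project.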