Applied Linear Algebra

# Applied Linear Algebra: Unlocking Real-World Solutions

## What is Applied Mathematics?

* **The Bridge:** Applied mathematics connects theoretical concepts to practical problems across diverse fields like science, engineering, business, and more.
* **Problem-Solvers:** Applied mathematicians develop and use models to understand and solve real-world challenges.
* **Tools of the Trade:** Models, simulations, and computational methods help us analyze and predict phenomena in the world around us.

## Why is Applied Mathematics Essential?

* **Problem-Solving Powerhouse:** Applied mathematics empowers us to tackle complex problems and make informed decisions in various domains.
* **Fueling Innovation:** It drives advancements in technology, medicine, climate science, finance, artificial intelligence, and numerous other fields.
* **Interdisciplinary Impact:** Applied mathematicians work alongside experts from different disciplines to find solutions to shared challenges.

## Programming: Empowering Applied Mathematics

* **Essential Skillset:** Programming is a vital tool for modern applied mathematicians, allowing us to implement and experiment with our models.
* **Why Programming Matters:**
    * **Bringing Models to Life:** Programming translates mathematical concepts into executable code, enabling us to test and refine our solutions.
    * **Rapid Experimentation:** We can quickly iterate on our models and algorithms, exploring different scenarios and optimizing performance.
    * **Simulation Power:** Programming allows us to simulate complex systems that may be difficult or impossible to analyze analytically.
    * **Visual Insights:** Graphs and visualizations generated through programming help us understand and communicate our findings effectively.
* **Languages and Tools:**
    * **Python:** Widely used for its simplicity, versatility, and extensive libraries for scientific computing (NumPy, SciPy, Matplotlib).
    * **Julia:** A newer language combining the ease of Python with the performance of C++, gaining popularity in scientific computing.
    * **R:** A powerful language for statistical analysis, data visualization, and machine learning.

## Linear Algebra: The Universal Language

* **Foundation of Models:** Linear algebra is the mathematical language used to describe relationships between variables in a wide range of models across science and engineering.
* **Data Analysis Powerhouse:** It's essential for understanding and analyzing data, from basic linear regression to advanced machine learning algorithms.
* **Optimization Expertise:** Linear algebra helps us find optimal solutions under constraints or limitations, a common need in many applications.
* **Computational Cornerstone:** Numerical methods used in simulations and models heavily rely on linear algebra for their accuracy and efficiency.

## Clustering: Discovering Hidden Structures

* **The Goal:** Group similar data points into meaningful clusters, revealing underlying patterns and relationships.
* **Applications:**
    * **Recommendation Systems:** Grouping users with similar preferences.
    * **Image Segmentation:** Identifying objects or regions within images.
    * **Anomaly Detection:** Detecting unusual patterns in data.
    * **Market Segmentation:** Grouping customers based on demographics or behavior.

### The Math (Simplified)

* We aim to minimize the average squared distance between data points and the center of their assigned cluster. This is known as the clustering objective $J_{\text{clust}}$. A lower $J_{\text{clust}}$ indicates better clustering.
$$J_{\text{clust}} = \frac1N \sum_{i=1}^N \|x_i-z_{c_i}\|^2$$

* where:
    * $x_i$ is a data point
    * $z_{c_i}$ is the center of the cluster that $x_i$ belongs to
    * $N$ is the total number of data points

![Screenshot 2024-06-03 at 10.20.14 AM](https://hackmd.io/_uploads/BJ2PdiqER.png)

## Least Squares – Finding the Best Fit in Noisy Data

* **The Goal:** Find the line (or curve) that best represents your data, even when it's noisy or imperfect.
* **Why It's Useful:**
    * **Trend Analysis:** Uncover underlying patterns and relationships in data.
    * **Prediction:** Forecast future values based on past observations.
    * **Model Fitting:** Find parameters of models that describe your data.
* **The Least Squares Problem:** Find the $x$ that minimizes the sum of the squared errors between the predicted and actual values.

$$\|Ax-b\|^2$$

* where
    * $A$ is the coefficient matrix of your system of equations
    * $x$ is the unknown solution
    * $b$ is your data
* **Methods:**
    1. **Normal Equations:** A direct way to solve for the optimal solution, but it can be numerically unstable when $A$ is ill-conditioned.
       $$(A^T A) \hat{x} = A^T b$$
    2. **QR Decomposition:** A more numerically robust method, often preferred for its stability.
       $$A=QR, \ \hat{x}=R^{-1} Q^T b$$

## Data Fitting with Least Squares

* **The Goal:** Find the best function to describe relationships in your data.
* **Choosing a Model:**
    * **Linear:** Simple relationships $f(x) = ax+b$.
      ![Screenshot 2024-06-01 at 10.07.37 PM](https://hackmd.io/_uploads/Hk04ssdNR.png)
    * **Polynomial:** Curved relationships $f(x)= \theta_0+\theta_1 x+\dots+\theta_p x^p$.
      ![Screenshot 2024-06-01 at 10.09.31 PM](https://hackmd.io/_uploads/SkkhijOVA.png)
    * **Piecewise Polynomial:** Data with changing trends.
      ![Screenshot 2024-06-01 at 11.45.28 PM](https://hackmd.io/_uploads/HkQmMTu4A.png)
* **Model Validation:** Prevent overfitting by evaluating your model on unseen data (holdout validation or cross-validation).
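
The two solution methods and the holdout check described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the synthetic cubic data, the noise level, the random seed, the 45/15 split, and the helper names `fit_normal_equations`, `fit_qr`, and `rms` are all assumptions made for the example and do not come from the slides.

```python
import numpy as np


def fit_normal_equations(A, b):
    """Least squares via the normal equations (A^T A) x = A^T b."""
    return np.linalg.solve(A.T @ A, A.T @ b)


def fit_qr(A, b):
    """Least squares via the reduced QR factorization A = QR, then R x = Q^T b."""
    Q, R = np.linalg.qr(A)  # default mode is 'reduced': Q is m x n, R is n x n
    # R is upper triangular; a dedicated triangular solver would be cheaper,
    # but a general solve keeps this sketch NumPy-only.
    return np.linalg.solve(R, Q.T @ b)


def rms(v):
    """Root-mean-square value of a vector."""
    return np.sqrt(np.mean(v ** 2))


# Synthetic noisy data (assumed for illustration): y = 1 - 2x + 0.5x^3 + noise
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 0.5 * x**3 + 0.05 * rng.standard_normal(x.size)

# Polynomial model of degree p = 3: columns of A are 1, x, x^2, x^3
degree = 3
A = np.vander(x, degree + 1, increasing=True)

# Holdout validation: fit on 45 random points, evaluate on the other 15
idx = rng.permutation(x.size)
train, test = idx[:45], idx[45:]

theta_ne = fit_normal_equations(A[train], y[train])
theta_qr = fit_qr(A[train], y[train])

train_err = rms(A[train] @ theta_qr - y[train])
test_err = rms(A[test] @ theta_qr - y[test])

print("theta (normal equations):", theta_ne)
print("theta (QR):              ", theta_qr)
print("train RMS error:", train_err)
print("test RMS error: ", test_err)
```

On a well-conditioned problem like this one, both methods return essentially the same coefficients. The difference shows up when the columns of $A$ are nearly dependent (for example, high-degree polynomials), where forming $A^T A$ squares the condition number. A test error close to the training error suggests the model is not overfitting.
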
![Screenshot 2024-06-01 at 10.11.28 PM](https://hackmd.io/_uploads/rJTmhsuN0.png)

### Case Study: Predicting House Prices

* **The Task:** Build a model to estimate house prices based on various features.
* **Our Approach:**
    * **Data:** Used information on square footage, location, condition, etc.
    * **Model:** Employed Ridge Regression, a linear model designed to handle many features and avoid overfitting.

$$\min_\theta \frac1N \sum_{i=1}^N ||x_i \cdot \theta - y_i||^2 + \lambda ||\theta||^2$$

* **The Results:**
    * **13% error** in price prediction.
    * **Key factors:** Year built, living area, basement size, condition, and quality of materials.

![Screenshot 2024-06-02 at 12.47.00 AM](https://hackmd.io/_uploads/HJxqlCdVR.png)
![Screenshot 2024-06-02 at 12.45.23 AM](https://hackmd.io/_uploads/B1RXgRuNA.png)
![Screenshot 2024-06-02 at 12.45.48 AM](https://hackmd.io/_uploads/HJcHg0uEC.png)

## Classification: Making Decisions from Data

* **The Goal:** Teach machines to categorize data into meaningful groups.
* **Types:**
    * **Binary:** Two categories (e.g., spam/not spam).
    * **Multi-Class:** More than two categories (e.g., types of flowers).
* **How it Works:**
    1. **Gather Labeled Data:** Examples with known categories.
    2. **Train a Model:** Learn patterns that distinguish categories.
    3. **Make Predictions:** Categorize new data based on the learned patterns.
* **Least Squares Classification:** A simple method that finds a linear boundary to separate classes.

$$\hat{f}(x) = \text{sign}(\tilde{f}(x))$$

![Screenshot 2024-06-01 at 10.17.21 PM](https://hackmd.io/_uploads/ryAOTjuNC.png)

## Multi-Objective & Constrained Least Squares

* **Multi-Objective:** Optimize multiple, potentially conflicting objectives simultaneously (e.g., maximizing profit while minimizing risk).
$$
\begin{split}
\min_x J \quad \text{s.t.} \quad &J = \lambda_1 J_1 + \dots + \lambda_n J_n,\ \sum_i \lambda_i=1 \\
&J_1 = \|A_1 x - b_1\|^2, \dots, J_n = \|A_n x - b_n\|^2
\end{split}
$$

* **Pareto Optimality:** Find solutions where no objective can be improved without making another worse. These solutions form a "Pareto frontier" representing the best possible trade-offs.

![Screenshot 2024-06-01 at 10.27.06 PM](https://hackmd.io/_uploads/rJ8a1h_EA.png)

* **Constrained:** Find the best solution while satisfying certain constraints (e.g., budget limitations, physical limits).

$$
\begin{split}
\min_x \quad& \|Ax-b\|^2 \\
\text{s.t.} \quad& Cx=d
\end{split}
$$

### Case Study: Portfolio Optimization

* **The Goal:** Allocate investments to maximize returns while minimizing risk.
* **The Constraints:**
    * **Budget:** Limited funds to invest.
    * **Long-Only:** No short-selling (betting against assets).
* **The Model:** Use constrained least squares to find the portfolio weights that best balance risk and return.

$$
\begin{split}
\min_w &\ -\text{avg}(Rw) + \lambda \cdot \text{std}(Rw)^2 \\
\text{s.t.} &\ \mathbf{1}^T w = 1
\end{split}
$$

![Screenshot 2024-06-01 at 10.41.15 PM](https://hackmd.io/_uploads/By8zX3uV0.png)
![Screenshot 2024-06-01 at 10.41.49 PM](https://hackmd.io/_uploads/BkdEm2_NR.png)

### Case Study: Linear Algebra in Self-Driving Car Control

* **The Challenge:** Control a car to reach its destination safely and efficiently.
* **Linear Algebra's Role:**
    * **Modeling the Car's Motion:** Equations describing how the car responds to control inputs.
    * **Path Planning:** Finding the optimal path to the destination.
    * **Balancing Objectives:** Minimizing fuel consumption while maximizing safety and speed.
$$
\begin{split}
\min_u& \ \sum_{k=1}^n \|C x_k\|^2 + \gamma \sum_{k=1}^{n-1} \|u_k\|^2 \\
\text{s.t.}& \ x_{k+1} = f(x_k, u_k), \quad k=1,\dots,n-1 \\
&\ x_1 = x^{\text{init}},\ x_n = x^{\text{des}}
\end{split}
$$

![Screenshot 2024-06-02 at 12.22.19 AM](https://hackmd.io/_uploads/S1vpqTO4C.png)
![Screenshot 2024-06-02 at 12.22.35 AM](https://hackmd.io/_uploads/SyU0qaO4C.png)

## Q&A and Discussion

## Reference

* [ENGR108: Introduction to Matrix Methods](https://stanford.edu/class/engr108/)
* [EE104/CME107: Introduction to Machine Learning](https://ee104.stanford.edu/lectures.html)
* [Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares](https://web.stanford.edu/~boyd/vmls/)
* [QuantEcon](https://quantecon.org/)