---
tags: Mathematical modeling and optimization
title: 5.2. Least squares method and Parameter fitting
---

## 5.2 Least squares method and Parameter fitting

After categorizing the types of optimization problems from a mathematical point of view, we now step into the engineering aspects. In many real-world situations, we encounter data points that do not perfectly align with a given model or theory. The __*Least Squares Method*__ is the most widely applied methodology for dealing with such situations. Combined with a function chosen as the logical model type, a static model can be derived via __*Parameter fitting*__. Such models come with a quantifiable accuracy and are capable of predicting the relevant circumstances.

### 5.2.1 Least squares method

The Least Squares Method is a widely used technique in engineering for fitting mathematical models to data. It is particularly useful when dealing with experimental or observational data that may contain noise or measurement errors. The goal of the least squares method is to find the parameters of a mathematical model that minimize the sum of the squares of the differences between the observed data and the model predictions. This approach allows engineers to estimate unknown parameters and make predictions based on limited or noisy data.

Given a set of data points $(x_i,y_i)$ for $i=1,2,...,n$, where $x_i$ represents the independent variable and $y_i$ represents the corresponding dependent variable, the least squares method aims to find the parameters $\theta$ of a mathematical model $f(x;\theta)$ that minimize the sum of the squared residuals:

__Minimize:__ $\qquad \displaystyle \large \sum_{i=1}^n \left(y_i - f(x_i;\theta) \right)^2$

The model $f(x;\theta)$ can be any function that depends on the independent variable $x$ and a set of parameters $\theta$. Common examples include linear models, polynomial models, exponential models, and trigonometric models. The parameters $\theta$ are adjusted iteratively to minimize the sum of squared residuals using optimization techniques such as gradient descent or direct optimization methods.

### 5.2.2 Parameter fitting

Now that the basic concept of the least squares method has been introduced, we can apply it (together with the optimization techniques from the last section) to determine the parameters of our mathematical models from a provided data set. From an engineering point of view, this methodology is fundamental for describing the state of a system and predicting possible incidents.

#### Applying gradient descent in the least squares optimization process

Following the technique we learned in the last section, the cost function defined by the least squares method can be minimized using the gradient descent method, as follows:

1. __Define the Cost Function__: The cost function $J(\theta)$ quantifies the error between the observed data points and the model predictions as

    $\qquad \displaystyle \large J(\theta) = \sum_{i=1}^n \left(y_i - f(x_i;\theta) \right)^2$

2. __Define the Gradient of the Cost Function__: Compute the gradient of the cost function $\nabla J(\theta)$ with respect to the parameters $\theta$. The gradient indicates the direction of steepest ascent in the parameter space.

3. __Update the Parameters using Gradient Descent__: Update the parameters $\theta$ iteratively using the gradient descent rule:

    $\qquad \large \displaystyle \theta \leftarrow \theta - \alpha\nabla J(\theta)$

    where $\alpha$ is the learning rate.

4. __Define a Convergence Criterion__: Determine a convergence criterion to stop the iterative process, such as reaching a maximum number of iterations or observing a negligible change in the cost function.
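The recipe above is iterative, but the "direct optimization methods" mentioned in Section 5.2.1 deserve a quick illustration: whenever $f(x;\theta)$ is a linear combination of known basis functions, the minimization reduces to a single linear algebra call. The following is a minimal sketch, assuming a straight-line model $f(x) = ax + b$ and synthetic data generated with $a=2$, $b=1$ (both choices are illustrative, not part of the worked example that follows):

```python
import numpy as np

# Synthetic data: a line with slope 2 and intercept 1, plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 1, 50)

# Design matrix: one column per basis function (x and the constant 1)
A = np.column_stack([x, np.ones_like(x)])

# np.linalg.lstsq minimizes ||A @ theta - y||^2 directly, without iteration
(a, b), residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(f"a = {a:.3f}, b = {b:.3f}")  # close to the true values 2 and 1
```

The worked example below takes the iterative route with gradient descent instead.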
As an example, fitting an assumed function $f(x) = ax + b + ce^x$, the gradient $\nabla J(a,b,c)$ yields:

$$\begin{align} &\displaystyle \frac{\partial J}{\partial a} = -2\sum_{i=1}^n x_i\left(y_i - f(x_i;\ a, b, c) \right) \\ &\displaystyle \frac{\partial J}{\partial b} = -2\sum_{i=1}^n \left(y_i - f(x_i;\ a, b, c) \right) \\ &\displaystyle \frac{\partial J}{\partial c} = -2\sum_{i=1}^n e^{x_i}\left(y_i - f(x_i;\ a, b, c) \right) \end{align}$$

A Python example reads:

```python
import numpy as np
import matplotlib.pyplot as plt

# Define the model (linear in its parameters a, b, c)
def linear_model(x, a, b, c):
    return a * x + b + c * np.exp(x)

# Generate example data from (a, b, c) = (2, 1, 3) plus Gaussian noise
np.random.seed(0)
x_data = np.linspace(0, 2, 50)
y_data = 2 * x_data + 1 + 3 * np.exp(x_data) + np.random.normal(0, 1, 50)

# Gradient descent settings
learning_rate = 0.0001
num_iterations = 1000
tolerance = 1e-6

# Initialize parameters
a = 0.0
b = 0.0
c = 0.0

# Perform gradient descent
for i in range(num_iterations):
    # Compute predictions
    predictions = linear_model(x_data, a, b, c)

    # Compute gradients (the partial derivatives derived above)
    grad_a = -2 * np.sum((y_data - predictions) * x_data)
    grad_b = -2 * np.sum(y_data - predictions)
    grad_c = -2 * np.sum((y_data - predictions) * np.exp(x_data))

    # Update parameters
    a = a - learning_rate * grad_a
    b = b - learning_rate * grad_b
    c = c - learning_rate * grad_c

    # Stop once the cost function becomes negligible
    cost = np.sum((y_data - predictions) ** 2)
    if cost < tolerance:
        break

# Plot the data and the fitted curve
plt.scatter(x_data, y_data, label='Data')
plt.plot(x_data, linear_model(x_data, a, b, c), color='red', label='Fitted Line')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Least Squares Fitting with Gradient Descent')
plt.legend()
plt.grid(True)
plt.show()

print("Fitted Parameters (Gradient Descent):")
print("(a, b, c):", a, b, c)
```

Showing:

<img src="https://live.staticflickr.com/65535/53515609966_baa5a0c225.jpg">

```
Fitted Parameters (Gradient Descent):
(a, b, c): 0.970, 1.725, 3.131
```

__See also:__ In the Python community, the least squares method is provided as a ready-made function in several packages, such as [__*NumPy*__](https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html), [__*SciPy*__](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html), [__*statsmodels*__](https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html), [__*Scikit-learn*__](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html), and [__*PyTorch*__](https://pytorch.org/docs/stable/generated/torch.linalg.lstsq.html) for models that are linear combinations of functions. For more specific applications, [__*curve_fit*__](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html) and [__*polyfit*__](https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html) are the tools to find the parameters of a general function or of a polynomial with a given order.

__Example__
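As a sketch of how such a packaged routine is used, the gradient descent fit above can be reproduced with `scipy.optimize.curve_fit`; the model, the synthetic data, and the all-zero starting guess `p0` are carried over from that example as illustrative assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

# Same model and synthetic data as the gradient descent example above
def model(x, a, b, c):
    return a * x + b + c * np.exp(x)

np.random.seed(0)
x_data = np.linspace(0, 2, 50)
y_data = 2 * x_data + 1 + 3 * np.exp(x_data) + np.random.normal(0, 1, 50)

# curve_fit runs the least squares minimization internally and also
# returns the covariance matrix of the fitted parameters
popt, pcov = curve_fit(model, x_data, y_data, p0=[0.0, 0.0, 0.0])
print("(a, b, c):", popt)
```

Compared with the hand-written gradient descent loop, no gradient has to be derived by hand: the routine handles the minimization, the step sizes, and the stopping criterion internally.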