# Numerical Optimization HW1
## Gradient descent
In this implementation, a line search is used to determine the optimal step size at each iteration; all other aspects of the algorithm follow the methodology presented earlier.
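A minimal sketch of this setup is shown below. It assumes the step size is found with a 1-D solver (`scipy.optimize.minimize_scalar` here, my choice rather than the assignment's); `f` and `grad_f` are placeholders for the objective and its gradient.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def gradient_descent(f, grad_f, x0, eps=1e-8, max_iter=10000):
    """Gradient descent; each step size is chosen by a 1-D line search along -grad."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:          # stop once the gradient is small enough
            break
        # Line search: pick the step size that (approximately) minimizes f along -g.
        t = minimize_scalar(lambda t: f(x - t * g)).x
        x = x - t * g
    return x
```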
## Newton's method
The convergence criterion is that the optimal point is reached when either the norm of the gradient at the current iterate or the norm of the Newton step falls below the specified epsilon threshold.
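A minimal sketch of this stopping rule, assuming the Hessian is invertible at each iterate; `grad_f` and `hess_f` are placeholders for the assignment's gradient and Hessian.

```python
import numpy as np

def newton_method(grad_f, hess_f, x0, eps=1e-8, max_iter=100):
    """Pure Newton's method: stop when the gradient norm or the step norm drops below eps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:              # criterion 1: small gradient norm
            return x
        step = np.linalg.solve(hess_f(x), g)     # Newton step: solve H @ step = g
        x = x - step
        if np.linalg.norm(step) < eps:           # criterion 2: small step norm
            return x
    return x
```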
## Experimental results
| | Newton-CG | CG | BFGS |
| ---------------------- | --------- | ---- | ---- |
| Current function value | 7.710593324935902e-10 | 1.8197767001736807e-11 | 8.144085490284565e-14 |
| Iterations | 275 | 38 | 35 |
| Function evaluations | 296 | 276 | 126 |
| Gradient evaluations | 296 | 92 | 42 |
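Results of this kind can be collected with `scipy.optimize.minimize`. The sketch below is only illustrative: it uses the Rosenbrock function as a stand-in objective, since the assignment's actual objective is defined elsewhere, so the printed numbers will not match the table above.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

x0 = np.array([-1.2, 1.0])
for method in ("Newton-CG", "CG", "BFGS"):
    # Newton-CG needs the Hessian; CG and BFGS only use the gradient.
    res = minimize(rosen, x0, method=method, jac=rosen_der,
                   hess=rosen_hess if method == "Newton-CG" else None)
    print(f"{method}: f={res.fun:.3e}, nit={res.nit}, nfev={res.nfev}, njev={res.njev}")
```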
### Pure Newton's Method vs. Gradient Descent:
When applied to convex functions, Newton's method requires significantly fewer iterations than gradient descent, since it uses second-order curvature information and converges quadratically near the minimizer, whereas gradient descent converges only linearly.
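A small illustration of this gap (not part of the assignment): on a convex quadratic, Newton's method reaches the minimizer in a single step because its local quadratic model is exact, while gradient descent with exact line search zig-zags when the Hessian is ill-conditioned.

```python
import numpy as np

A = np.diag([1.0, 100.0])            # convex quadratic f(x) = 0.5 * x^T A x, condition number 100
grad = lambda x: A @ x

# Newton's method: one step lands on the exact minimizer (the origin).
x = np.array([1.0, 1.0])
x = x - np.linalg.solve(A, grad(x))
print("Newton after 1 step:", x)

# Gradient descent with exact line search on the same quadratic.
x, iters = np.array([1.0, 1.0]), 0
while np.linalg.norm(grad(x)) > 1e-8:
    g = grad(x)
    t = (g @ g) / (g @ A @ g)        # exact line search step size for a quadratic
    x = x - t * g
    iters += 1
print("Gradient descent iterations:", iters)
```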
### Summary of visualization
The visualization results reveal similar convergence patterns for the BFGS, CG, and Newton-CG methods. Gradient descent with the optimal step size behaves differently: it approaches the global minimum rapidly at first but converges more slowly in later iterations. In contrast, pure Newton's method converges very efficiently, moving swiftly and directly to the global minimum, consistent with the theoretical expectations for Newton-based optimization.
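A minimal plotting sketch of such a comparison, assuming each solver's iterate history is recorded as a `(k, 2)` array `path` (e.g. via a callback) and that `f` is the 2-D objective; both names are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_path(f, path, label):
    """Overlay one optimizer's iterate trajectory on the contour lines of f."""
    xs = np.linspace(-2.0, 2.0, 200)
    ys = np.linspace(-1.0, 3.0, 200)
    X, Y = np.meshgrid(xs, ys)
    Z = np.vectorize(lambda a, b: f(np.array([a, b])))(X, Y)
    plt.contour(X, Y, Z, levels=30)
    plt.plot(path[:, 0], path[:, 1], marker="o", markersize=3, label=label)
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.legend()
```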
### Performance Comparison:
* BFGS shows the best performance, achieving convergence with the fewest iterations (35)
* The CG method requires a moderate number of iterations (38) but more function evaluations (276) than BFGS (126)
* Newton-CG demands the highest number of iterations (275) and function evaluations