Polynomial regression

Problem

Consider the data:

| $i$ | $x_i$ | $y_i$ |
|-----|-------|-------|
| 1   | 1     | 2     |
| 2   | 2     | 0     |
| 3   | 3     | 0     |
| 4   | 4     | 1     |

Find a polynomial $f(x) = c_0 + c_1 x + c_2 x^2$ such that

$$\sum_{i=1}^{4} \bigl(f(x_i) - y_i\bigr)^2$$

is minimized.

Thought

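For any given data set $(x_1, y_1), \ldots, (x_N, y_N)$ and the quadratic polynomial $f(x) = c_0 + c_1 x + c_2 x^2$, the key observation is that

$$\begin{bmatrix} f(x_1) \\ \vdots \\ f(x_N) \end{bmatrix} = \begin{bmatrix} c_0 + c_1 x_1 + c_2 x_1^2 \\ \vdots \\ c_0 + c_1 x_N + c_2 x_N^2 \end{bmatrix} = \begin{bmatrix} 1 & x_1 & x_1^2 \\ \vdots & \vdots & \vdots \\ 1 & x_N & x_N^2 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix}.$$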
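The identity above can be checked numerically. The following is a minimal sketch in plain Python: it builds the rows $(1, x_i, x_i^2)$ of the matrix and compares the matrix-vector product with direct evaluation of $f$. The helper names `design_matrix` and `matvec`, and the test coefficients, are illustrative assumptions and not part of the note.

```python
def design_matrix(xs, degree=2):
    """Return the rows [1, x, x**2, ..., x**degree], one per data point."""
    return [[x ** k for k in range(degree + 1)] for x in xs]


def matvec(A, c):
    """Multiply the matrix A (a list of rows) by the vector c."""
    return [sum(a * ck for a, ck in zip(row, c)) for row in A]


xs = [1, 2, 3, 4]
c = [2, -1, 3]  # arbitrary coefficients c0, c1, c2, chosen only for illustration

A = design_matrix(xs)
direct = [c[0] + c[1] * x + c[2] * x ** 2 for x in xs]

# Evaluating f pointwise and multiplying A by c give the same vector.
print(matvec(A, c) == direct)
```

Because the unknown coefficients enter only through the product with the matrix, fitting the polynomial is a linear problem in $c_0, c_1, c_2$ even though $f$ is not linear in $x$.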

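With

$$A = \begin{bmatrix} 1 & x_1 & x_1^2 \\ \vdots & \vdots & \vdots \\ 1 & x_N & x_N^2 \end{bmatrix}, \quad c = \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix}, \quad \text{and} \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix},$$

we are looking for a vector $c$ that minimizes $\|Ac - y\|^2$. This is a least squares problem, and when $A^\top A$ is invertible (which holds whenever at least three of the $x_i$ are distinct) we know that the answer is

$$c = (A^\top A)^{-1} A^\top y.$$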
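One way to carry out this computation is sketched below, using only the Python standard library. Instead of forming the inverse explicitly, the sketch solves the equivalent system $A^\top A\,c = A^\top y$ by Gauss-Jordan elimination in exact rational arithmetic. The function names `normal_equations`, `solve`, and `least_squares` are hypothetical, and the code assumes $A^\top A$ is invertible.

```python
from fractions import Fraction


def normal_equations(A, y):
    """Form A^T A and A^T y from a matrix A (a list of rows) and a vector y."""
    n = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(n)]
    return AtA, Aty


def solve(M, b):
    """Solve M c = b by Gauss-Jordan elimination with exact fractions.

    Assumes M is square and invertible; a singular M raises StopIteration.
    """
    n = len(M)
    # Augmented matrix [M | b] with Fraction entries.
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def least_squares(A, y):
    """Return c minimizing ||Ac - y||^2, assuming A^T A is invertible."""
    AtA, Aty = normal_equations(A, y)
    return solve(AtA, Aty)


# Small check: the points (0, 1), (1, 3), (2, 5) lie exactly on y = 1 + 2x,
# so fitting c0 + c1*x should recover c0 = 1 and c1 = 2.
line_fit = least_squares([[1, 0], [1, 1], [1, 2]], [1, 3, 5])
print(line_fit == [1, 2])
```

Exact fractions are used here only to keep the arithmetic easy to follow by hand; for large or ill-conditioned problems a numerical library routine based on a QR or SVD factorization is usually preferred over the normal equations.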

Sample answer

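Let

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix} \quad \text{and} \quad y = \begin{bmatrix} 2 \\ 0 \\ 0 \\ 1 \end{bmatrix}.$$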

Then the answer is

$$c = (A^\top A)^{-1} A^\top y = \begin{bmatrix} 5.25 \\ -4.05 \\ 0.75 \end{bmatrix}.$$
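As a sketch of the intermediate arithmetic (worked out here from the entries of $A$ and $y$), the two products that appear in the formula are

$$A^\top A = \begin{bmatrix} 4 & 10 & 30 \\ 10 & 30 & 100 \\ 30 & 100 & 354 \end{bmatrix}, \qquad A^\top y = \begin{bmatrix} 3 \\ 6 \\ 18 \end{bmatrix}.$$

Substituting the stated $c$ into $A^\top A\,c = A^\top y$ confirms the result row by row:

$$\begin{aligned} 4(5.25) + 10(-4.05) + 30(0.75) &= 21 - 40.5 + 22.5 = 3, \\ 10(5.25) + 30(-4.05) + 100(0.75) &= 52.5 - 121.5 + 75 = 6, \\ 30(5.25) + 100(-4.05) + 354(0.75) &= 157.5 - 405 + 265.5 = 18. \end{aligned}$$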

Thus, $f(x) = 5.25 - 4.05x + 0.75x^2$ is the quadratic polynomial that best fits the data in the least squares sense. One may use Desmos to plot the function and see whether it is close to the data points.
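Besides plotting, the fit can be checked directly. The sketch below, in plain Python with exact fractions, evaluates the residuals $f(x_i) - y_i$ and tests the condition $A^\top(Ac - y) = 0$ that characterizes a least squares minimizer. The variable names are illustrative, and the coefficients are the ones from the sample answer written as fractions.

```python
from fractions import Fraction

xs = [1, 2, 3, 4]
ys = [2, 0, 0, 1]

# Coefficients 5.25, -4.05, 0.75 from the sample answer, as exact fractions.
c0, c1, c2 = Fraction(21, 4), Fraction(-81, 20), Fraction(3, 4)


def f(x):
    """Evaluate the fitted quadratic at x."""
    return c0 + c1 * x + c2 * x ** 2


residuals = [f(x) - y for x, y in zip(xs, ys)]
sse = sum(r * r for r in residuals)  # the quantity being minimized

# At a minimizer the residual vector is orthogonal to each column of A,
# that is, to the vectors with entries 1, x_i, and x_i**2.
orthogonality = [sum(r * x ** k for r, x in zip(residuals, xs)) for k in range(3)]

print([str(r) for r in residuals])
print(sse)
print(orthogonality == [0, 0, 0])
```

The residuals come out as $-\tfrac{1}{20}, \tfrac{3}{20}, -\tfrac{3}{20}, \tfrac{1}{20}$, so the minimized sum of squares is $\tfrac{1}{20} = 0.05$, and all three orthogonality sums vanish.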

This note can be found at Course website > Learning resources.