Linear regression

Problem

Consider the data:

i    x_i    y_i
1    1      1
2    2      1
3    3      2
4    4      2

Find a straight line $f(x) = c_0 + c_1 x$ such that

$$\sum_{i=1}^{4} \bigl(f(x_i) - y_i\bigr)^2$$

is minimized.

Thought

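For any given data set $(x_1, y_1), \dots, (x_N, y_N)$ and the straight line $f(x) = c_0 + c_1 x$, the key observation here is that

$$\begin{bmatrix} f(x_1) \\ \vdots \\ f(x_N) \end{bmatrix} = \begin{bmatrix} c_0 + c_1 x_1 \\ \vdots \\ c_0 + c_1 x_N \end{bmatrix} = c_0 \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} + c_1 \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix}.$$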

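With

$$A = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix}, \quad c = \begin{bmatrix} c_0 \\ c_1 \end{bmatrix}, \quad \text{and} \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix},$$

we are looking for the vector $c$ that minimizes $\|Ac - y\|^2$. This is a least-squares problem, and we know that the answer is $c = (A^\top A)^{-1} A^\top y$.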
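As a reminder of where this formula comes from, here is a short sketch of the standard argument. It assumes that at least two of the $x_i$ are distinct, so that the columns of $A$ are linearly independent and $A^\top A$ is invertible. Expanding the squared norm and setting its gradient with respect to $c$ to zero gives the normal equations:

$$\|Ac - y\|^2 = c^\top A^\top A c - 2 c^\top A^\top y + y^\top y,$$

$$\nabla_c \|Ac - y\|^2 = 2 A^\top A c - 2 A^\top y = 0 \quad \Longrightarrow \quad A^\top A c = A^\top y \quad \Longrightarrow \quad c = (A^\top A)^{-1} A^\top y.$$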

Sample answer

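Let

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix} \quad \text{and} \quad y = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \end{bmatrix}.$$

Then the answer is

$$c = (A^\top A)^{-1} A^\top y = \begin{bmatrix} 4 & 10 \\ 10 & 30 \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.4 \end{bmatrix}.$$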
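For readers who want to check the arithmetic by hand, the intermediate steps can be written out as follows. The determinant of $A^\top A$ is $4 \cdot 30 - 10 \cdot 10 = 20$, so

$$A^\top y = \begin{bmatrix} 1 + 1 + 2 + 2 \\ 1 + 2 + 6 + 8 \end{bmatrix} = \begin{bmatrix} 6 \\ 17 \end{bmatrix}, \qquad (A^\top A)^{-1} = \frac{1}{20} \begin{bmatrix} 30 & -10 \\ -10 & 4 \end{bmatrix},$$

$$c = \frac{1}{20} \begin{bmatrix} 30 \cdot 6 - 10 \cdot 17 \\ -10 \cdot 6 + 4 \cdot 17 \end{bmatrix} = \frac{1}{20} \begin{bmatrix} 10 \\ 8 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.4 \end{bmatrix}.$$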

Thus, $f(x) = 0.5 + 0.4x$ is the straight line that best describes the data, which is also known as the best-fitting line.
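The result can also be checked numerically. The following is a minimal sketch in plain Python, not part of the original note. The names `fit_line` and `sum_squared_error` are hypothetical helpers introduced here for illustration. The sketch solves the 2-by-2 normal equations $A^\top A c = A^\top y$ in closed form and assumes the $x_i$ are not all equal.

```python
def fit_line(xs, ys):
    """Return (c0, c1) minimizing sum((c0 + c1*x - y)**2) over the data."""
    n = len(xs)
    # Entries of A^T A, where each row of A is [1, x_i].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # Entries of A^T y.
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of A^T A; it is zero exactly when all x_i coincide.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the best line is not unique")
    # Apply the explicit inverse of the 2x2 matrix A^T A to A^T y.
    c0 = (sxx * sy - sx * sxy) / det
    c1 = (n * sxy - sx * sy) / det
    return c0, c1


def sum_squared_error(c0, c1, xs, ys):
    """Sum of squared residuals of the line f(x) = c0 + c1*x."""
    return sum((c0 + c1 * x - y) ** 2 for x, y in zip(xs, ys))


# Data from the problem statement.
xs = [1, 2, 3, 4]
ys = [1, 1, 2, 2]

c0, c1 = fit_line(xs, ys)
sse = sum_squared_error(c0, c1, xs, ys)
print(c0, c1)  # prints: 0.5 0.4
```

For this data the residuals $f(x_i) - y_i$ are $-0.1, 0.3, -0.3, 0.1$, so the minimized sum of squares is $0.2$ (up to floating-point rounding in `sse`). Any other choice of $c_0$ and $c_1$ gives a larger value.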

This note can be found at Course website > Learning resources.