w_lstsqs


6. Explanation of the method

One way to invent the method of normal equations is to use calculus. The error is $E(x_1,\dots,x_n)$. Take the partial derivative with respect to each $x_i$ and set it equal to zero. You get $n$ simultaneous equations in $x_1,\dots,x_n$, which turn out to be the same as the normal equations.
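To make the calculus route concrete, here is a sketch of the computation. It assumes that the error is the sum of squared residuals and that $A$ is an $m \times n$ matrix with entries $a_{ij}$; the index names $i$, $j$, $k$ and the size $m$ are notation chosen for this sketch, not taken from the text above.

```latex
\begin{align*}
E(x_1,\dots,x_n) &= \sum_{i=1}^{m} \Bigl( \sum_{k=1}^{n} a_{ik} x_k - b_i \Bigr)^2 \\
\frac{\partial E}{\partial x_j} &= 2 \sum_{i=1}^{m} a_{ij} \Bigl( \sum_{k=1}^{n} a_{ik} x_k - b_i \Bigr) = 0,
  \qquad j = 1,\dots,n \\
\sum_{i=1}^{m} a_{ij} \sum_{k=1}^{n} a_{ik} x_k &= \sum_{i=1}^{m} a_{ij} b_i
  \quad\text{that is,}\quad (A^t A \mathbf{x})_j = (A^t \mathbf{b})_j .
\end{align*}
```

Collecting the $n$ equations, one for each $j$, gives $A^t A \mathbf{x} = A^t \mathbf{b}$.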

Here is a better way, based on the geometrical interpretation discussed above. Recall that

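(i) the set of all vectors of the form $A\mathbf{x}$ is a subspace, namely the column space of $A$ (the subspace of $\mathbb{R}^m$ spanned by the columns of $A$, where $m$ is the number of rows of $A$);

(ii) we are looking for the particular $\mathbf{x}$ for which $A\mathbf{x}$ is closest to $\mathbf{b}$. To avoid confusion, call this vector $\hat{\mathbf{x}}$;

(iii) the error vector is $\epsilon = A\hat{\mathbf{x}} - \mathbf{b}$.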

We need only two more facts:

(iv) The shortest distance from a vector b to a subspace is along a perpendicular. (See Problems for a verification.)

(v) In a matrix product $CD$, the entries are dot products of rows of $C$ with columns of $D$.
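Fact (v) can be checked on a small example. The sketch below uses plain Python lists; the matrices `C` and `D` and the helper names `dot`, `column`, and `matmul` are made up for illustration.

```python
# Fact (v) on a tiny example: each entry of a matrix product is a dot product.
# C and D are made-up illustration data, stored as lists of rows.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def column(M, j):
    """Column j of a matrix stored as a list of rows."""
    return [row[j] for row in M]

def matmul(C, D):
    """Entry (i, j) of CD is (row i of C) dotted with (column j of D)."""
    return [[dot(row, column(D, j)) for j in range(len(D[0]))] for row in C]

C = [[1, 2, 3],
     [4, 5, 6]]
D = [[1, 0],
     [0, 1],
     [2, 2]]

P = matmul(C, D)
print(P)  # [[7, 8], [16, 17]]
```

For instance, the entry 16 is row 2 of `C` dotted with column 1 of `D`: 4·1 + 5·0 + 6·2.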

Now, by (iv), $\epsilon$ is perpendicular to all vectors in the column space of $A$. By (i), the columns of $A$ are in the column space, so $\epsilon$ is perpendicular to each column of $A$. Thus the dot product of $\epsilon$ with every column of $A$ is zero. Write the dot products by putting $A$ on its side, so that its columns become the rows of $A^t$, and by (v) they are the entries of the matrix product: $A^t \epsilon = \mathbf{0}$.

This says $A^t (A\hat{\mathbf{x}} - \mathbf{b}) = \mathbf{0}$, so $A^t A \hat{\mathbf{x}} = A^t \mathbf{b}$, the normal equations!
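The argument can be tried on a small case. The sketch below fits a line to three made-up data points, solves the normal equations in exact fractions, and checks that the error vector is perpendicular to every column of the matrix. The data, the helper names, and the 2-by-2 Cramer's rule solver are assumptions made for this illustration only; a general problem would need a general linear solver.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def transpose(M):
    """Put a matrix (list of rows) on its side: columns become rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Matrix times vector: one dot product per row."""
    return [dot(row, v) for row in M]

def matmul(C, D):
    """Matrix product: rows of C dotted with columns of D."""
    Dt = transpose(D)
    return [[dot(row, col) for col in Dt] for row in C]

def solve_2x2(M, r):
    """Solve M x = r for a 2x2 matrix by Cramer's rule (assumes det != 0)."""
    (a, b_), (c, d) = M
    det = a * d - b_ * c
    return [(r[0] * d - b_ * r[1]) / det, (a * r[1] - c * r[0]) / det]

def squared_error(A, x, b):
    """Squared length of the error vector A x - b."""
    e = [p - q for p, q in zip(matvec(A, x), b)]
    return dot(e, e)

# Made-up data: fit y = x1 + x2 * t to the points (1, 1), (2, 2), (3, 2).
A = [[Fraction(1), Fraction(t)] for t in (1, 2, 3)]
b = [Fraction(1), Fraction(2), Fraction(2)]

At = transpose(A)
x_hat = solve_2x2(matmul(At, A), matvec(At, b))   # normal equations
eps = [p - q for p, q in zip(matvec(A, x_hat), b)]  # error vector

print([str(v) for v in x_hat])            # ['2/3', '1/2']
print([str(v) for v in matvec(At, eps)])  # ['0', '0']

# Moving away from x_hat can only increase the error, as fact (iv) predicts.
x_other = [x_hat[0] + Fraction(1, 10), x_hat[1]]
print(squared_error(A, x_hat, b) < squared_error(A, x_other, b))  # True
```

Exact fractions are used so that the perpendicularity check comes out as exactly zero rather than a small rounding error.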


Kirby A. Baker 2003-05-13