
6. Explanation of the method

One way to invent the method of normal equations is to use calculus. The error is $E(x_1,\dots,x_n)$. Take the partial derivative with respect to each $x_i$ and set it equal to zero. You get $n$ simultaneous equations in $x_1,\dots,x_n$, which turn out to be the same as the normal equations.
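This calculus route can be checked on a small concrete instance. The sketch below is not part of the notes: $A$, $\mathbf{b}$, and the variable names are made up for illustration. It forms $E(x_1,x_2)$ symbolically, sets the two partial derivatives to zero, and confirms the resulting system has the same solution as $A^tA\,\mathbf{x} = A^t\mathbf{b}$.

```python
# A sketch (not from the notes themselves) checking the calculus route
# on a made-up instance: A is 3x2, so the error is E(x1, x2).
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
xv = sp.Matrix([x1, x2])
A = sp.Matrix([[1, 0], [1, 1], [1, 2]])   # example data, chosen arbitrarily
b = sp.Matrix([1, 2, 2])

r = A * xv - b                            # residual vector A x - b
E = sum(ri**2 for ri in r)                # the squared error E(x1, x2)

# Set each partial derivative dE/dx_i equal to zero...
grad = [sp.Eq(sp.diff(E, v), 0) for v in (x1, x2)]

# ...and compare with the normal equations A^t A x = A^t b.
lhs, rhs = A.T * A * xv, A.T * b
normal_eqs = [sp.Eq(lhs[i], rhs[i]) for i in range(2)]

print(sp.solve(grad, [x1, x2]))           # {x1: 7/6, x2: 1/2}
print(sp.solve(normal_eqs, [x1, x2]))     # the same solution
```

Apart from an overall factor of 2, each gradient equation is exactly one row of the normal equations.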

Here is a better way, based on the geometrical interpretation discussed above. Recall that

(i) the set $W$ of all vectors of the form $A\mathbf{x}$ is a subspace, namely the column space of $A$ (the subspace of $\mathbf{R}^m$ spanned by the columns of $A$, where $A$ is $m \times n$);

(ii) we are looking for the particular $\mathbf{x}$ for which $A\mathbf{x}$ is closest to $\mathbf{b}$. To avoid confusion, call this vector $\mathbf{x}_0$;

(iii) the error vector is $\epsilon = A\mathbf{x}_0 - \mathbf{b}$.

We need only two more facts:

(iv) The shortest distance from a vector $\mathbf{b}$ to a subspace is along a perpendicular. (See Problems for a verification.)

(v) In a matrix product $AB$, the entries are dot products of rows of $A$ with columns of $B$.
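Fact (iv) can be illustrated numerically. In this sketch (with made-up data, not from the notes), $W$ is taken to be the simplest kind of subspace, a line through the origin; the foot of the perpendicular from $\mathbf{b}$ turns out to be at least as close to $\mathbf{b}$ as every other sampled point of $W$.

```python
# A small numerical illustration (not part of the notes) of fact (iv):
# the closest point of a subspace W to b is the foot of the perpendicular.
import numpy as np

a = np.array([1.0, 2.0, 2.0])    # W = all multiples of a (a line through 0)
b = np.array([3.0, 0.0, 3.0])

# Foot of the perpendicular: the projection of b onto a.
p = (b @ a) / (a @ a) * a
assert abs((b - p) @ a) < 1e-12  # the error b - p is perpendicular to W

# Every other sampled point of W is at least as far from b as p is.
for t in np.linspace(-3.0, 3.0, 61):
    q = t * a
    assert np.linalg.norm(b - q) >= np.linalg.norm(b - p) - 1e-12

print("foot of the perpendicular:", p)
```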

Now, by (iv), $\epsilon$ is perpendicular to all vectors in $W$. By (i), the columns of $A$ are in $W$, so $\epsilon$ is perpendicular to each column of $A$; in other words, the dot product of $\epsilon$ with every column of $A$ is zero. By (v), all of these dot products can be written at once by putting $A$ on its side: they say exactly that the matrix product $A^t\epsilon = \mathbf{0}$.

This says $A^t(A\mathbf{x}_0 - \mathbf{b}) = \mathbf{0}$, so $A^tA\,\mathbf{x}_0 = A^t\mathbf{b}$, the normal equations!
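The whole derivation can be checked numerically. The following sketch (with made-up $A$ and $\mathbf{b}$, not taken from the notes) solves the normal equations, verifies that the resulting error vector is perpendicular to every column of $A$, and confirms agreement with NumPy's own least-squares routine.

```python
# A sketch in NumPy (data made up for illustration): solve the normal
# equations A^t A x0 = A^t b and check the claims of the derivation.
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

x0 = np.linalg.solve(A.T @ A, A.T @ b)   # the normal equations
eps = A @ x0 - b                         # the error vector

# A^t eps = 0: eps is perpendicular to every column of A.
assert np.allclose(A.T @ eps, 0.0)

# x0 agrees with the library's least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x0, x_lstsq)

print(x0)   # approximately [7/6, 1/2] for this data
```

Solving via `A.T @ A` keeps the code close to the derivation; in practice `np.linalg.lstsq` is preferred because it avoids the squared condition number of $A^tA$.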




Kirby A. Baker 2003-05-13