6.2 Continuous least squares
In discrete least squares, our starting point was a set of data points. Here we will start with a continuous function \(f\) on \([a,b]\) and answer the following question: how can we find the “best” polynomial \(P_{n}(x)=\sum_{j=0}^{n}a_{j}x^{j}\) of degree at most \(n\) that approximates \(f\) on \([a,b]\)? As before, the “best” polynomial will mean the polynomial that minimizes the least squares error:

\[E=\int_{a}^{b}\left[f(x)-P_{n}(x)\right]^{2}dx.\tag{52}\]
Compare this expression with that of the discrete least squares, where the data points are \((x_{i},y_{i})\), \(i=1,...,m\):

\[E=\sum_{i=1}^{m}\left(y_{i}-P_{n}(x_{i})\right)^{2}.\]
To minimize \(E\) in (52) we set \(\frac{\partial E}{\partial a_{k}}=0\), for \(k=0,1,...,n,\) and observe

\[0=\frac{\partial E}{\partial a_{k}}=-2\int_{a}^{b}x^{k}f(x)dx+2\sum_{j=0}^{n}a_{j}\int_{a}^{b}x^{j+k}dx,\]
which gives the \((n+1)\) normal equations for the continuous least squares problem:

\[\sum_{j=0}^{n}a_{j}\int_{a}^{b}x^{j+k}dx=\int_{a}^{b}x^{k}f(x)dx\tag{53}\]
for \(k=0,1,...,n.\) Note that the only unknowns in these equations are the \(a_{j}\)’s; hence this is a linear system of equations. It is instructive to compare these normal equations with those of the discrete least squares problem:

\[\sum_{j=0}^{n}a_{j}\sum_{i=1}^{m}x_{i}^{j+k}=\sum_{i=1}^{m}y_{i}x_{i}^{k}\]

for \(k=0,1,...,n.\)
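Here is a minimal Python sketch (not part of the text; the function name lsq_poly_coeffs and the use of scipy.integrate.quad are illustrative choices) that assembles the coefficient matrix and right-hand side of (53) and solves the resulting linear system for a given \(f\), interval \([a,b]\), and degree \(n\):

```python
import numpy as np
from scipy.integrate import quad

def lsq_poly_coeffs(f, a, b, n):
    """Solve the continuous least squares normal equations for the
    coefficients a_0, ..., a_n of P_n(x) = sum_j a_j x^j on [a, b]."""
    # Coefficient matrix: entry (k, j) is the integral of x^(j+k) over [a, b].
    A = np.array([[(b**(j + k + 1) - a**(j + k + 1)) / (j + k + 1)
                   for j in range(n + 1)] for k in range(n + 1)])
    # Right-hand side: entry k is the integral of x^k * f(x) over [a, b].
    rhs = np.array([quad(lambda x, k=k: x**k * f(x), a, b)[0]
                    for k in range(n + 1)])
    return np.linalg.solve(A, rhs)
```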
Find the least squares polynomial approximation of degree 2 to \(f(x)=e^{x}\) on \((0,2)\).
Solution.
The normal equations are:

\[\sum_{j=0}^{2}a_{j}\int_{0}^{2}x^{j+k}dx=\int_{0}^{2}x^{k}e^{x}dx\]
for \(k=0,1,2.\) Here are the three equations:

\[a_{0}\int_{0}^{2}dx+a_{1}\int_{0}^{2}xdx+a_{2}\int_{0}^{2}x^{2}dx=\int_{0}^{2}e^{x}dx\]
\[a_{0}\int_{0}^{2}xdx+a_{1}\int_{0}^{2}x^{2}dx+a_{2}\int_{0}^{2}x^{3}dx=\int_{0}^{2}xe^{x}dx\]
\[a_{0}\int_{0}^{2}x^{2}dx+a_{1}\int_{0}^{2}x^{3}dx+a_{2}\int_{0}^{2}x^{4}dx=\int_{0}^{2}x^{2}e^{x}dx\]
Computing the integrals we get

\[2a_{0}+2a_{1}+\frac{8}{3}a_{2}=e^{2}-1\]
\[2a_{0}+\frac{8}{3}a_{1}+4a_{2}=e^{2}+1\]
\[\frac{8}{3}a_{0}+4a_{1}+\frac{32}{5}a_{2}=2e^{2}-2,\]
whose solution is \(a_{0}=3(-7+e^{2})\), \(a_{1}=-\frac{3}{2}(-37+5e^{2}),\) \(a_{2}=\frac{15}{4}(-7+e^{2}).\) Then

\[P_{2}(x)=3(-7+e^{2})-\frac{3}{2}(-37+5e^{2})x+\frac{15}{4}(-7+e^{2})x^{2}\approx1.17+0.08x+1.46x^{2}.\]
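As a quick check, here is a small SymPy sketch (an illustration, not part of the text) that sets up the three normal equations above and solves them exactly:

```python
import sympy as sp

# Re-derive the worked example symbolically: the degree-2 least squares
# fit to f(x) = exp(x) on (0, 2), by solving the normal equations exactly.
x = sp.symbols('x')
a = sp.symbols('a0:3')   # coefficients a0, a1, a2
f = sp.exp(x)

normal_eqs = [
    sp.Eq(sum(a[j] * sp.integrate(x**(j + k), (x, 0, 2)) for j in range(3)),
          sp.integrate(x**k * f, (x, 0, 2)))
    for k in range(3)
]
sol = sp.solve(normal_eqs, a)

# Numerically: approximately [1.1672, 0.0821, 1.4590], matching
# 3*(e^2 - 7), -(3/2)*(5*e^2 - 37), and (15/4)*(e^2 - 7).
print([sp.N(sol[ai], 6) for ai in a])
```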
The solution method we have discussed, solving the normal equations of the least squares problem as a matrix equation, has certain drawbacks:
The integrals \(\int_{a}^{b}x^{i+j}dx=\left(b^{i+j+1}-a^{i+j+1}\right)/(i+j+1)\) in the coefficient matrix give rise to a matrix equation that is prone to roundoff error (see the sketch below): on \([0,1]\), for example, the coefficient matrix is the notoriously ill-conditioned Hilbert matrix with entries \(1/(i+j+1)\).
There is no easy way to go from \(P_{n}(x)\) to \(P_{n+1}(x)\) (which we might want to do if we were not satisfied with the approximation provided by \(P_{n}\)).
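To illustrate the first drawback, here is a small sketch (an illustration, not part of the text) that computes the condition number of the coefficient matrix on \([0,1]\) for increasing degrees:

```python
import numpy as np

# Condition number of the coefficient matrix of the normal equations on [0, 1],
# whose (k, j) entry is 1/(j + k + 1) (the Hilbert matrix).
for n in range(2, 11, 2):
    H = np.array([[1.0 / (j + k + 1) for j in range(n + 1)]
                  for k in range(n + 1)])
    print(n, np.linalg.cond(H))
```

The condition number grows roughly exponentially with \(n\), so even modest degrees lead to a severe loss of accuracy when the system is solved in floating point arithmetic.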
There is a better approach to solving the discrete and continuous least squares problems, using the orthogonal polynomials we encountered in Gaussian quadrature. Both the discrete and the continuous least squares problems try to find a polynomial \(P_{n}(x)=\sum_{j=0}^{n}a_{j}x^{j}\) that satisfies some properties. Notice how the polynomial is written in terms of the monomial basis functions \(x^{j}\), and recall how these basis functions caused numerical difficulties in interpolation. That was the reason we discussed different basis functions, like Lagrange and Newton, for the interpolation problem. So the idea is to write \(P_{n}(x)\) in terms of some other basis functions \(\phi_{j}\),

\[P_{n}(x)=\sum_{j=0}^{n}a_{j}\phi_{j}(x),\]
which would then update the normal equations for continuous least squares (53) as

\[\sum_{j=0}^{n}a_{j}\int_{a}^{b}\phi_{j}(x)\phi_{k}(x)dx=\int_{a}^{b}f(x)\phi_{k}(x)dx\]
for \(k=0,1,...,n.\) The normal equations for the discrete least squares (43) get a similar update:

\[\sum_{j=0}^{n}a_{j}\sum_{i=1}^{m}\phi_{j}(x_{i})\phi_{k}(x_{i})=\sum_{i=1}^{m}y_{i}\phi_{k}(x_{i})\]

for \(k=0,1,...,n.\)
Going forward, the crucial observation is that the integral of the product of two functions \(\int \phi_j(x) \phi_k(x) dx\), or the summation of the product of two functions evaluated at some discrete points, \(\sum \phi_j(x_i)\phi_k(x_i)\), can be viewed as an inner product \(\left\langle \phi_{j},\phi_{k}\right\rangle \) of two vectors in a suitably defined vector space. When the functions (vectors) \(\phi_{j}\) are orthogonal, the inner product \(\left\langle \phi_{j},\phi_{k}\right\rangle \) is 0 if \(j\neq k\), which makes the normal equations trivial to solve. We will discuss details in the next section.
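For a concrete preview, here is a small sketch (assuming the Legendre polynomials, which are orthogonal on \([-1,1]\), are used as the \(\phi_{j}\); this choice is an assumption for illustration) showing that the coefficient matrix \(\left\langle \phi_{j},\phi_{k}\right\rangle =\int_{-1}^{1}\phi_{j}(x)\phi_{k}(x)dx\) of the normal equations becomes diagonal, so each \(a_{k}\) can be solved for independently:

```python
import numpy as np
from scipy.integrate import quad
from numpy.polynomial import legendre

# Coefficient matrix of the continuous normal equations when the basis
# functions are the Legendre polynomials on [-1, 1]; off-diagonal entries vanish.
n = 3
phi = [legendre.Legendre.basis(j) for j in range(n + 1)]
G = np.array([[quad(lambda x, j=j, k=k: phi[j](x) * phi[k](x), -1, 1)[0]
               for j in range(n + 1)] for k in range(n + 1)])
print(np.round(G, 10))  # diagonal matrix; entry (k, k) equals 2/(2k + 1)
```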