Polynomial interpolation

$$ %% Sets and functions %% \newcommand{\set}[1]{\{ #1 \}} \newcommand{\Set}[1]{\left \{ #1 \right\}} \renewcommand{\emptyset}{\varnothing} \newcommand{\N}{\mathbb{N}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\R}{\mathbb{R}} \newcommand{\Rn}{\mathbb{R}^n} \newcommand{\Rm}{\mathbb{R}^m} \newcommand{\C}{\mathbb{C}} %% Linear algebra %% \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\Abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\inner}[2]{\langle #1, #2 \rangle} \newcommand{\Inner}[2]{\left\langle #1, #2 \right\rangle} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\Norm}[1]{\left\lVert #1 \right\rVert} \newcommand{\trans}{{\top}} \newcommand{\span}{\mathop{\mathrm{span}}} $$

Lagrange interpolation

Suppose that $f$ is a real-valued function defined on a set of distinct nodes $\set{x_j}_{j=0}^n \subseteq \R$ and let $P_n$ denote the vector space of real polynomial functions of degree at most $n$. Then there exists a unique $p_n \in P_n$ that interpolates $f$ at the nodes in the sense that $p_n(x_j) = f(x_j)$ for each $j$: $$ p_n = \sum_j f(x_j) \ell_j, $$ where $\ell_j := \prod_{i \neq j} \frac{\bullet - x_i}{x_j-x_i} \in P_n$. Clearly, $p_n$ interpolates $f$ since $\ell_j(x_i) = \delta_{ij}$, and it is unique because if $q_n \in P_n$ also interpolates $f$ (at the same nodes), then $p_n - q_n \in P_n$ has $n + 1$ distinct zeroes and must therefore be the zero polynomial. The polynomials $\set{\ell_j}_{j=0}^n$ constitute a basis of $P_n$ called the Lagrange basis.

Defining the barycentric weights $w_j := \frac{1}{\prod_{i \neq j} (x_j - x_i)}$, which notably depend only on the nodes, we can express $p_n$ as $\omega \sum_j \frac{w_j f(x_j)}{\bullet - x_j}$ (except at the nodes – where the values of $p_n$ are given), where $\omega := \prod_i (\bullet - x_i)$ is the nodal polynomial. In particular, taking $f = 1$ implies that $\omega \sum_j \frac{w_j}{\bullet - x_j} = 1$, whence we obtain the barycentric formula $$ p_n = \frac{\sum_j \frac{w_j f(x_j)}{\bullet - x_j}}{\sum_j \frac{w_j}{\bullet - x_j}}. $$

Hermite interpolation

More generally, suppose that $f$ is a real-valued function defined on a multiset of nodes $\set{x_j^{1+\mu_j}}_{j=0}^m \subseteq \R$, where $x_j$ has multiplicity $1 + \mu_j$ and $f$ is $\mu_j$ times differentiable at $x_j$. Then there exists a unique polynomial $p_n \in P_n$, where $n = m + \sum_j \mu_j$, that interpolates $f$ at the nodes in the sense that $p_n^{(k)}(x_j) = f^{(k)}(x_j)$ for each $j$ and each $0 \leq k \leq \mu_j$.

Indeed, such a polynomial can be constructed recursively as follows: if $m > 0$, let $p_- \in P_{n-1}$ interpolate $f$ at $\set{x_j^{1 + \mu_j}}_{j=0}^m \setminus \set{x_m^1}$ and $p_+ \in P_{n-1}$ interpolate $f$ at $\set{x_j^{1+\mu_j}}_{j=0}^m \setminus \set{x_0^1}$, and define $p_n \in P_n$ as $$ \begin{align*} p_n &:= \frac{x_m - \bullet}{x_m - x_0} p_- + \frac{\bullet - x_0}{x_m - x_0} p_+ \\ &\hphantom{:}= p_- + \frac{\bullet - x_0}{x_m - x_0} (p_+ - p_-). \end{align*} $$ Clearly, $p_n(x_j) = f(x_j)$ for each $j$ since $p_-(x_j) = f(x_j) = p_+(x_j)$ for $0 < j < m$, $p_-(x_0) = f(x_0)$, and $p_+(x_m) = f(x_m)$. Moreover, for $k > 0$, we have $$ p_n^{(k)} = p_-^{(k)} + \frac{\bullet - x_0}{x_m - x_0} (p_+ - p_-)^{(k)} + k \cdot \frac{1}{x_m - x_0} (p_+ - p_-)^{(k-1)}, $$ so, similarly, $p_n^{(k)}(x_j) = f^{(k)}(x_j)$ for each $j$ and each $0 < k \leq \mu_j$. In the base case $m = 0$, we take $p_n$ to be the $\mu_0$th degree Taylor polynomial of $f$ centred at $x_0$. Uniqueness follows as above, counting zeroes with multiplicity.

By specifying appropriate values for $f$ and its derivatives at the nodes, we can construct $h_{jk} \in P_n$ such that $p_n = \sum_j \sum_k f^{(k)}(x_j) h_{jk}$ for all $f$. These polynomials $\set{h_{jk}}$ constitute a basis of $P_n$ called the Hermite basis. For instance, if $\mu_j = 0$ for each $j$, then $h_{j, 0} = \ell_j$ and we recover the Lagrange basis functions; if instead $m = 0$, then $h_{0, k} = \frac{1}{k!}(\bullet-x_0)^k$, which is sometimes called a Taylor basis function.

Newton interpolation

Newton interpolation recasts Lagrange/Hermite interpolation in a more explicit basis in which the coefficients of the interpolating polynomial can still be efficiently computed. Let $t_0, \dots, t_n$ be an enumeration of the nodes $\set{x_j^{1+\mu_j}}_{j=0}^m$ (with multiplicity, in any order). The Newton basis $\set{\omega_j}_{j=0}^n$ of $P_n$ is then defined as $\omega_j := \prod_{i < j} (\bullet-t_i)$.

To compute the coefficients of the interpolating polynomial $p_n$ in this basis, we define the divided difference $f[t_0, \dots, t_n]$ as the $(\bullet)^n$ coefficient of $p_n$ in the monomial basis $\set{(\bullet)^j}_{j=0}^n$ (which for brevity we will refer to as the “leading coefficient” despite the fact that it may be zero). The coefficients of $p_n$ in the Newton basis are then successive divided differences of $f$.

Coefficients of interpolating polynomial in Newton basis $$ p_n = \sum_j f[t_0, \dots, t_j] \omega_j $$

Proof. Write $p_n = \sum_j c_j \omega_j$. For each $j$, the polynomial $\sum_{k \leq j} c_k \omega_k \in P_j$ interpolates $f$ at $\set{t_0, \dots, t_j}$ since $\omega_k(t_i) = 0$ (with multiplicity) for all $i \leq j < k$. Clearly, its leading coefficient is $c_j$, so by definition, $c_j = f[t_0, \dots, t_j]$. ∎

From Lagrange interpolation, we obtain an explicit formula for divided differences when the $t_j$ are distinct in terms of the barycentric weights: $$ f[t_0, \dots, t_n] = \sum_j w_j f(t_j). $$ More generally, the recursive construction in Hermite interpolation shows that divided differences obey a recurrence relation.

Recurrence relation for divided differences

Suppose that the nodes are ordered such that $t_0 = t_n$ implies that $t_0 = \cdots = t_n$. Then $$ f[t_0, \dots, t_n] = \begin{cases} \displaystyle \frac{f[t_1, \dots, t_n] - f[t_0, \dots, t_{n-1}]}{t_n - t_0} & \text{if $t_0 \neq t_n$}, \\ \displaystyle \frac{f^{(n)}(t_0)}{n!} & \text{if $t_0 = t_n$}. \end{cases} $$

Proof. This follows immediately from the construction above: namely, if $t_0 \neq t_n$, then $p_n = p_- + \frac{\bullet - t_0}{t_n - t_0}(p_+ - p_-)$, where $p_- \in P_{n-1}$ interpolates $f$ at $\set{t_0, \dots, t_{n-1}}$ and $p_+ \in P_{n-1}$ interpolates $f$ at $\set{t_1, \dots, t_n}$; otherwise, if $t_0 = t_n$, then $p_n$ is the $n$th degree Taylor polynomial of $f$ centred at $t_0$. ∎

This also yields a recursive algorithm for evaluating $p_n$ known as Neville’s algorithm. To wit, suppose that the nodes are ordered such that $t_i = t_j$ for $i < j$ implies that $t_i = \cdots = t_j$, and let $p_{i, j} \in P_{j-i}$ interpolate $f$ at $\set{t_i, \dots, t_j}$ so that $p_n = p_{0, n}$. Then $$ p_{i, j}(t) = \begin{cases} \displaystyle \frac{(t-t_i) p_{i+1, j}(t) - (t-t_j) p_{i, j-1}(t)}{t_j - t_i} & \text{if $t_i \neq t_j$}, \\ \displaystyle \sum_{k = 0}^{j-i} \frac{f^{(k)}(t_i)}{k!} (t-t_i)^k & \text{if $t_i = t_j$}. \end{cases} $$

Properties of divided differences

  • (Linearity) If $\alpha, \beta \in \R$, then $(\alpha f + \beta g)[t_0, \dots, t_n] = \alpha \cdot f[t_0, \dots, t_n] + \beta \cdot g[t_0, \dots, t_n]$.
  • (Symmetry) If $\sigma$ is a permutation of $\set{0, \dots, n}$, then $f[t_0, \dots, t_n] = f[t_{\sigma(0)}, \dots, t_{\sigma(n)}]$.
  • (Factor property) If $n \geq 1$, then $\big((\bullet - t_0)f\big)[t_0, \dots, t_n] = f[t_1, \dots, t_n]$.

Proof.

  • (Linearity) If $p_f \in P_n$ and $p_g \in P_n$ interpolate $f$ and $g$, respectively, at $\set{t_0, \dots, t_n}$, then $\alpha p_f + \beta p_g \in P_n$ interpolates $\alpha f + \beta g$ at $\set{t_0, \dots, t_n}$.
  • (Symmetry) This is immediate since the definition of the divided difference is independent of the ordering of the nodes.
  • (Factor property) If $p_f \in P_{n-1}$ interpolates $f$ at $\set{t_1, \dots, t_n}$, then $(\bullet-t_0) p_f \in P_n$ interpolates $(\bullet - t_0)f$ at $\set{t_0, \dots, t_n}$. ∎

In fact, the recurrence relation, excluding the base case, can be derived solely from these three properties: $$ \begin{align*} (t_n-t_0) \cdot f[t_0, \dots, t_n] &= \big( ((\bullet - t_0) - (\bullet - t_n)) f \big)[t_0, \dots, t_n] \\ &= \big( (\bullet - t_0) f \big)[t_0, \dots, t_n] - \big( (\bullet - t_n) f \big)[t_0, \dots, t_n] \\ &= f[t_1, \dots, t_n] - f[t_0, \dots, t_{n-1}] \end{align*} $$ Thus, these properties along with the property $f[t_0, \dots, t_n] = \frac{f^{(n)}(t_0)}{n!}$ when $t_0 = \cdots = t_n$ characterize divided differences.

The identity $f[t_0, t_1] = \frac{f(t_1) - f(t_0)}{t_1-t_0}$ for $t_0 \neq t_1$ suggests another relationship between divided differences and derivatives: if, say, $t_0 < t_1$, $f$ is continuous on $[t_0, t_1]$, and $f'$ exists on $(t_0, t_1)$, then the mean value theorem amounts to the assertion that there exists a $\xi \in (t_0, t_1)$ such that $f[t_0, t_1] = f'(\xi)$. This generalizes readily to divided differences and derivatives of higher order.

Mean value theorem for divided differences

Suppose that $a := \min\,\set{t_j}_{j=0}^n < \max\,\set{t_j}_{j=0}^n =: b$. If $f^{(n-1)}$ is continuous on $[a, b]$ and $f^{(n)}$ exists on $(a, b)$, then there exists a $\xi \in (a, b)$ such that $$ f[t_0, \dots, t_n] = \frac{f^{(n)}(\xi)}{n!}. $$

Proof. Let $p_n \in P_n$ interpolate $f$ at $\set{t_0, \dots, t_n}$. Then $f-p_n$ has $n+1$ zeroes in $[a, b]$ (with multiplicity), so by repeated applications of Rolle’s theorem, $(f-p)^{(n)} = f^{(n)} - f[t_0, \dots, t_n] n!$ has a zero $\xi \in (a, b)$. ∎

As a consequence, we can express the error in polynomial interpolation, which for $t_0 = \cdots = t_n$ reduces to the statement of Taylor’s theorem.

Polynomial interpolation error

Suppose that $a := \min\,\set{t_j}_{j=0}^n \cup \set{t} < \max\,\set{t_j}_{j=0}^n \cup \set{t} =: b$. If $f^{(n)}$ is continuous on $[a, b]$ and $f^{(n+1)}$ exists on $(a, b)$, then there exists a $\xi \in (a, b)$ such that $$ f(t) - p_n(t) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \omega_{n+1}(t) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_j (t-t_j). $$

Proof. Let $p_n \in P_n$ interpolate $f$ at $\set{t_0, \dots, t_n}$. Then $p_n + f[t_0, \dots, t_n, t] \omega_{n+1}$ interpolates $f$ at $\set{t_0, \dots, t_n, t}$, so the result follows from the mean value theorem for divided differences. ∎

We also have the identity $(fg)[t_0, t_1] = f[t_0] \cdot g[t_0, t_1] + f[t_0, t_1] \cdot g[t_1]$, where the case $t_0 = t_1$ is the product rule for derivatives (which also follows from taking $t_1 \to t_0$ in the case $t_0 \neq t_1$). More generally, we have the following identity, which for $t_0 = \cdots = t_n$ reduces to the generalized product rule for derivatives $(fg)^{(n)} = \sum_j \binom{n}{j} f^{(j)} g^{(n-j)}$.

Product rule for divided differences $$ (fg)[t_0, \dots, t_n] = \sum_j f[t_0, \dots, t_j] \cdot g[t_j, \dots, t_n] $$

Proof. Let $p_n \in P_n$ interpolate $f$ at $\set{t_0, \dots, t_n}$. Then $(fg)[t_0, \dots, t_n] = (p_n g)[t_0, \dots, t_n]$ since $fg$ agrees with $p_n g$ on $\set{t_0, \dots, t_n}$. By linearity and the factor property, we have $$ \begin{align*} (p_n g)[t_0, \dots, t_n] &= \left(\sum_j f[t_0, \dots, t_j] \omega_j g\right)[t_0, \dots, t_n] \\ &= \sum_j f[t_0, \dots, t_j] \cdot (\omega_j g)[t_0, \dots, t_n] \\ &= \sum_j f[t_0, \dots, t_j] \cdot g[t_j, \dots, t_n]. \qquad \small \blacksquare \end{align*} $$ Furthermore, from the recurrence relation for divided differences, we see that if $f \in C^0$, then $f[t_0, \dots, t_n]$ is jointly continuous in $t_0, \dots, t_n$ wherever they are distinct; if $f \in C^n$, the mean value theorem for divided differences implies that it is jointly continuous everywhere.