Differential Equations

$$ \renewcommand{\vec}[1]{\mathbf{#1}} \newcommand{\grad}{\nabla} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\Abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\inner}[2]{\langle #1, #2 \rangle} \newcommand{\Inner}[2]{\left\langle #1, #2 \right\rangle} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\Norm}[1]{\left\lVert #1 \right\rVert} \newcommand{\closure}[1]{\overline{#1}} \newcommand{\R}{\mathbb{R}} \newcommand{\Rm}{\mathbb{R}^m} \newcommand{\Rn}{\mathbb{R}^n} \newcommand{\C}{\mathbb{C}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\res}{\mathrm{res}} \newcommand{\D}{\mathbb{D}} \newcommand{\H}{\mathbb{H}} \newcommand{\Aut}{\mathrm{Aut}} \newcommand{\End}{\mathrm{End}} \newcommand{\F}{\mathbb{F}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\lt}[1]{\mathcal{L}\{#1\}} \newcommand{\Lt}[1]{\mathcal{L}\left\{#1\right\}} \newcommand{\ilt}[1]{\mathcal{L}^{-1}\{#1\}} \newcommand{\Ilt}[1]{\mathcal{L}^{-1}\left\{#1\right\}} $$

Ordinary differential equations (ODEs)

Scalar ODEs

Ordinary differential equation (ODE): equation of the form $F(x, y, y', \dots, y^{(n)}) = 0$, where $y$ is a function of $x$

Order of an ODE: order of the highest derivative appearing in the equation

Autonomous ODE: $F$ does not (explicitly) depend on $x$

Linear ODE: $F$ is a linear combination of $y$ and its derivatives (with coefficients depending only on $x$), plus a function of $x$ called the source term/forcing function; that is, an equation of the form $$ \sum_{i=0}^{n} a_i(x) y^{(i)} = b(x) $$

Homogeneous linear ODE: $b(x) \equiv 0$ (in which case $y = 0$ is a trivial solution)

Nonhomogeneous linear ODE: $b(x) \neq 0$

The general solution to a linear ODE is $y = y_c + y_p$, where $y_c$ is the general solution of the homogeneous equation (the complementary solution) and $y_p$ is a solution of the nonhomogeneous equation (a particular solution).

Superposition principle: a linear combination of solutions to a homogeneous linear ODE is also a solution to the ODE 1

A basis for the space of solutions to a homogeneous linear ODE is called a fundamental system of solutions.

First-order scalar ODEs

Picard-Lindelöf theorem

Given the initial value problem $y' = f(x, y),\ y(x_0) = y_0$, suppose that $f$ is continuous in $x$ and Lipschitz continuous in $y$ (uniformly in $x$). Then there exists a unique solution to the IVP on $[x_0 - \epsilon, x_0 + \epsilon]$ for some $\epsilon > 0$.

Separable equations

To solve a separable equation $y' = f(x) g(y)$:

  • Integrate: $$ \int \frac{dy}{g(y)} = \int f(x) \, dx $$
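
For example, for $y' = x y^2$, separating and integrating gives $$ \int \frac{dy}{y^2} = \int x \, dx \implies -\frac{1}{y} = \frac{x^2}{2} + C \implies y = \frac{-2}{x^2 + 2C}. $$ (The constant solution $y = 0$, lost upon dividing by $g(y)$, should be checked separately.)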

First-order linear ODEs

To solve a first-order linear ODE of the form $y' + p(x) y = f(x)$:

  • Multiply both sides by an integrating factor $r(x) = e^{\int p(x) \, dx}$, which yields $(r(x) \cdot y)' = r(x) f(x)$

  • Integrate: $$ y = \frac{1}{r(x)} \int r(x) f(x) \, dx = e^{-\int p(x) \, dx} \int e^{\int p(x) \, dx} f(x) \, dx $$
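
For example, for $y' + \frac{2}{x} y = x$ (on $x > 0$), the integrating factor is $r(x) = e^{\int 2/x \, dx} = x^2$, so $$ (x^2 y)' = x^3 \implies x^2 y = \frac{x^4}{4} + C \implies y = \frac{x^2}{4} + \frac{C}{x^2}. $$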

Bernoulli equations

To solve a Bernoulli equation $y' + p(x) y = q(x) y^n$:

  • Substitute $u = y^{1-n}$, which yields the first-order linear equation $\frac{1}{1-n} u' + p(x) u = q(x)$
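
For example, for $y' + y = y^2$ (so $n = 2$), the substitution $u = y^{-1}$ gives $-u' + u = 1$, a first-order linear equation with solution $u = 1 + Ce^x$; hence $$ y = \frac{1}{u} = \frac{1}{1 + Ce^x}. $$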

Given a first-order autonomous equation $y' = f(y)$, the points on the $y$-axis where $f(y) = 0$ are called critical points. If $y_0$ is a critical point, the constant solution $y = y_0$ is called an equilibrium solution.

The phase line of such an equation consists of the $y$-axis along with arrows indicating the sign of $f$ between each of the critical points – ▲ if $f > 0$; ▼ if $f < 0$. If both arrows point toward a critical point, it is stable; if both point away, it is unstable; otherwise, it is semi-stable (sometimes just called unstable).

A general first-order scalar ODE $y' = f(x, y)$ may be visualized using a slope field, a plot of the $xy$-plane with small line segments of slope $f(x, y)$ at each point $(x, y)$.

Second-order scalar ODEs

We will consider only linear second-order ODEs.

Existence and uniqueness of solutions to second-order linear ODEs

Given the equation $y'' + p(x) y' + q(x) y = f(x)$, suppose that $p, q, f$ are continuous on some interval $I$. Then for any fixed $x_0 \in I$ and $y_0, y_0'$, there exists a unique solution to the ODE on $I$ satisfying $y(x_0) = y_0, y'(x_0) = y_0'$.

Superposition of solutions to second-order linear homogeneous ODEs

Given the equation $y'' + p(x) y' + q(x) y = 0$, suppose that $p, q$ are continuous. Then the solution space of the ODE is two-dimensional; viz., if $y_1, y_2$ are linearly independent solutions to the ODE, the general solution is $y = c_1 y_1 + c_2 y_2$.


Now we consider the constant-coefficient second-order linear homogeneous ODE $ay'' + by' + cy = 0$, where $a, b, c \in \R$. Assuming the solution is of the form $y = e^{\lambda x}$ (with $\lambda$ possibly non-real), we obtain the characteristic equation $a\lambda^2 + b\lambda + c = 0$.

Second-order linear homogeneous ODEs with constant coefficients

To solve an ODE of the form $ay'' + by' + cy = 0$:

  • Compute the roots $\lambda_1, \lambda_2$ of its characteristic equation

  • If $\lambda_1, \lambda_2$ are real and distinct ($b^2 - 4ac > 0$), the general solution is $y = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x}$ 2

  • If $\lambda_1, \lambda_2$ are real and equal ($b^2 - 4ac = 0$), the general solution is $y = c_1 e^{\lambda_1 x} + c_2 x e^{\lambda_1 x}$

  • If $\lambda_1, \lambda_2$ are complex ($b^2 - 4ac < 0$) with $\lambda_{1, 2} = \mu \pm \omega i$, the general solution is $y = c_1 e^{\mu x} \cos(\omega x) + c_2 e^{\mu x} \sin(\omega x)$
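
For example, for $y'' + 4y' + 13y = 0$, the characteristic equation $\lambda^2 + 4\lambda + 13 = 0$ has roots $\lambda_{1, 2} = -2 \pm 3i$, so the general solution is $$ y = c_1 e^{-2x} \cos(3x) + c_2 e^{-2x} \sin(3x). $$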

Reduction of order of a second-order linear ODE

Given the equation $y'' + p(x) y' + q(x) y = f(x)$:

  • Let $y_1$ be a solution to the homogeneous equation $y'' + p(x) y' + q(x) y = 0$
  • Substitute the ansatz $y_2 = v(x) y_1$, which yields a first-order linear ODE in $v'$
  • Solving for $y_2$ yields a second, linearly independent solution to the second-order ODE

In particular, reduction of order applied to $ay'' + by' + cy = 0$ in the case $\lambda_1 = \lambda_2 = -b/(2a)$ with $y_1 = e^{\lambda_1 x}$ yields $y_2 = xe^{\lambda_1 x}$.


The (translational mechanical) harmonic oscillator is modelled by the constant-coefficient second-order linear ODE $$ m x''(t) + c x'(t) + kx(t) = F(t), $$ where $x(t)$ is the position of the object as a function of time, $m > 0$ is the mass of the object, $c \geq 0$ is the damping coefficient, $k > 0$ is the spring constant, and $F$ is a driving/forcing function.

Several physical systems are harmonic oscillators, as exemplified by the following table.

| Mass-spring system | Series RLC circuit | Pendulum |
| --- | --- | --- |
| $mx'' + cx' + kx = F(t)$ | $Lq'' + Rq' + q/C = \mathcal{E}(t)$ | $L \theta'' + g\theta = 0$ |
| Position $x$ | Charge $q$ | Angle $\theta$ |
| Mass $m$ | Inductance $L$ | Length $L$ |
| Damping coefficient $c$ | Resistance $R$ | |
| Spring constant $k$ | Inverse capacitance $1/C$ | Gravitational acceleration $g$ |
| Driving force $F$ | EMF $\mathcal{E}$ | |

Damped; undamped harmonic oscillator: $c > 0$; $c = 0$

Forced; unforced/free harmonic oscillator: $F \not\equiv 0$; $F \equiv 0$

Undamped/natural angular frequency: $\omega_0 = \sqrt{k/m}$ (i.e., the angular frequency of the oscillator if it were undamped) 3

Damping ratio: $\zeta = c/(2\sqrt{mk})$

We first consider the behaviour of unforced oscillators.

Behaviour of an unforced undamped harmonic oscillator

If $c = 0$ (i.e., $\zeta = 0$), the oscillator is undamped. The general solution $x = c_1 \cos(\omega_0 t) + c_2 \sin(\omega_0 t)$, written in amplitude-phase form as $$ x = \sqrt{c_1^2 + c_2^2} \cos \left( \omega_0 t - \tan^{-1} \left(\frac{c_2}{c_1}\right) \right), $$ oscillates with amplitude $\sqrt{c_1^2 + c_2^2}$, angular frequency $\omega_0$, and phase shift $\tan^{-1} (c_2 / c_1)$.

If the oscillator is damped, we distinguish three cases, depending as above on the sign of the discriminant of the characteristic polynomial.

Behaviour of an unforced damped harmonic oscillator

  • If $c^2 - 4mk > 0$ (i.e., $\zeta > 1$), the oscillator is overdamped

    • The solution $x = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t}$ decays as $t \to \infty$ and does not oscillate
    • Its decay constants are $-\lambda_{1, 2} = \omega_0 \zeta \mp \omega_0\sqrt{\zeta^2 - 1}$
  • If $c^2 - 4mk = 0$ (i.e., $\zeta = 1$), the oscillator is critically damped

    • The solution $x = c_1 e^{\lambda_1 t} + c_2 t e^{\lambda_1 t}$ decays as $t \to \infty$ and does not oscillate
    • Its decay constant is $-\lambda_1 = \omega_0 \zeta $
  • If $c^2 - 4mk < 0$ (i.e., $\zeta < 1$), the oscillator is underdamped

    • The solution $x = e^{\mu t} (c_1 \cos(\omega_1 t) + c_2 \sin(\omega_1 t)) $ decays as $t \to \infty$ while oscillating 4

      • Its decay constant is $-\mu = \omega_0 \zeta$ and the angular pseudo-frequency of its oscillations is $\omega_1 = \omega_0 \sqrt{1 - \zeta^2}$

Now suppose the oscillator is subject to sinusoidal forcing $F(t) = F_0 \cos(\omega t)$.

Behaviour of a forced undamped harmonic oscillator

  • If $\omega \neq \omega_0$, there are beats
    • The particular solution is $x_p = F_0 \cos(\omega t) / (m(\omega_0^2 - \omega^2))$
    • This combines with $x_c$ to produce beats; the closer $\omega$ is to $\omega_0$, the greater the amplitude of the beats
  • If $\omega = \omega_0$, there is resonance
    • The particular solution is $x_p = F_0 t \sin(\omega t) / (2m\omega)$
    • This dominates $x_c$ as $t \to \infty$, producing increasingly large oscillations at the resonant frequency

Behaviour of a forced damped harmonic oscillator

The particular solution $x_p$ is a sinusoid with amplitude $$ \frac{F_0}{\sqrt{[m (\omega_0^2 - \omega^2)]^2 + (c\omega)^2}} $$ and angular frequency $\omega$, and is called the steady-state solution. The complementary solution decays exponentially and is therefore called the transient solution.

The amplitude of $x_p$ is maximized at the practical resonance frequency $\omega = \omega_0 \sqrt{1 - 2\zeta^2} $, provided that $\zeta < 1/\sqrt{2}$.

If $\zeta \geq 1/\sqrt{2}$, there is no maximum (for $\omega > 0$), but the amplitude increases as $\omega \to 0^+$.

Higher-order scalar ODEs

$n$th-order linear homogeneous ODEs with constant coefficients

To solve an ODE of the form $\sum_{i=0}^{n} a_i y^{(i)} = 0$:

  • Compute the roots of its characteristic polynomial $\sum_{i=0}^{n} a_i \lambda^i$
  • Each real root $\lambda$ of multiplicity $k$ contributes $(\sum_{j = 0}^{k - 1} c_j x^j) e^{\lambda x}$ to the general solution
  • Each pair of complex roots $\mu \pm \omega i$ of (individual) multiplicity $k$ contributes $(\sum_{j=0}^{k-1} c_j x^j) e^{\mu x} \cos(\omega x) + (\sum_{j=0}^{k-1} d_j x^j) e^{\mu x} \sin(\omega x)$ to the general solution
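
For example, for $y^{(4)} + 2y'' + y = 0$, the characteristic polynomial is $\lambda^4 + 2\lambda^2 + 1 = (\lambda^2 + 1)^2$, whose roots $\pm i$ each have multiplicity $2$; the general solution is therefore $$ y = (c_0 + c_1 x) \cos x + (d_0 + d_1 x) \sin x. $$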

There are two commonly used methods for finding a particular solution to a nonhomogeneous linear ODE with source term $f(x)$.

The method of undetermined coefficients

Suppose that $f(x) = P(x) e^{\lambda x}$ for some polynomial $P$ of degree $m$ and some $\lambda \in \C$. Given the equation $\sum_{i=0}^{n} a_i y^{(i)} = f(x)$:

  • Let $k \geq 0$ be the multiplicity of $\lambda$ as a root of its characteristic polynomial

  • Substitute the ansatz $y_p = x^k \left( \sum_{j=0}^{m} A_j x^{j} \right) e^{\lambda x}$ and match coefficients to determine the $A_j$

    • If $\lambda = \mu \pm \omega i$, the ansatz $y_p = x^k \left( \sum_{j=0}^{m} A_j x^{j} \right) e^{\mu x} \cos(\omega x) + x^k \left( \sum_{j=0}^{m} B_j x^{j} \right) e^{\mu x} \sin(\omega x)$ can be used instead (which is useful when $f(x) = P(x) \cos(\omega x)$ or $f(x) = P(x) \sin(\omega x)$, for example)
  • If $f$ is a sum of such terms, apply the principle of superposition; i.e., solve the equation with each term as the source term separately, then add the solutions
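
For example, for $y'' - y = x e^{x}$, we have $\lambda = 1$ with $k = 1$ and $m = 1$, so we substitute $y_p = x(A_0 + A_1 x)e^x$; matching coefficients gives $A_1 = 1/4$ and $A_0 = -1/2$, whence $$ y_p = \left( \frac{x^2}{4} - \frac{x}{2} \right) e^x. $$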

To verify that a solution set of an ODE is linearly independent, it is occasionally useful to compute the Wronskian determinant.

Wronskian (determinant) of $f_1, \dots, f_n$: $$ W(f_1, \dots, f_n)(x) = \det \begin{bmatrix} f_1(x) & \cdots & f_n(x) \\ f_1'(x) & \cdots & f_n'(x) \\ \vdots & \ddots & \vdots \\ f_1^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{bmatrix} $$

Linear dependence and the Wronskian

Given a set of $(n-1) $-times differentiable functions, if the functions are linearly dependent on an interval, then their Wronskian vanishes identically thereon. 5

Of course, the contrapositive of this result is used in practice.

The method of variation of parameters

Given the equation $y^{(n)} + \sum_{i=0}^{n-1} a_i(x) y^{(i)} = f(x)$:

  • Let $y_1, \dots, y_n$ be a fundamental system of solutions to the homogeneous equation

  • Substitute the ansatz $y_p = \sum_{i=1}^{n} c_i(x) y_i$ and impose the constraints $\sum_{i=1}^{n} c_i'(x) y_i^{(j)} = 0$ for $j = 0, \dots, n-2$

  • Solve the resulting linear system $$ \begin{bmatrix} y_1 & \cdots & y_n \\ \vdots & \ddots & \vdots \\ y_1^{(n-2)} & \cdots & y_n^{(n-2)} \\ y_1^{(n-1)} & \cdots & y_n^{(n-1)} \end{bmatrix} \begin{bmatrix} c_1'(x) \\ \vdots \\ c_n'(x) \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ f(x) \end{bmatrix} $$

    • Note that the matrix is the Wronskian matrix of the fundamental system, so by Cramer’s rule, $c_i(x) = \int W_i(y_1, \dots, y_n)(x) / W(y_1, \dots, y_n)(x) \, dx$, where $W_i(y_1, \dots, y_n)$ denotes the Wronskian determinant with the $i$th column of the matrix replaced by the right-hand side
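
For example, for $y'' + y = \tan x$ with $y_1 = \cos x$, $y_2 = \sin x$ (so $W = 1$), the system gives $c_1' = -\sin x \tan x$ and $c_2' = \sin x$, so $c_1 = \sin x - \ln \abs{\sec x + \tan x}$ and $c_2 = -\cos x$, whence $$ y_p = c_1 \cos x + c_2 \sin x = -\cos x \ln \abs{\sec x + \tan x}. $$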

The Laplace transform

Laplace transform of $f: [0, \infty) \to \C$: $$ \lt{f}(s) = F(s) = \int_0^\infty f(t) e^{-st} \,dt $$

  • If $f \in L^1_\mathrm{loc}([0, \infty))$ and $f$ is of exponential type (that is, $f \in \mathcal{O}(e^{at})$ as $t \to \infty$ for some $a \in \R$), then $\lt{f}$ is defined for all $\Re(s) > a$

    • Moreover, $\lim_{s \to \infty} \lt{f}(s) = 0$
  • The Laplace transform is linear

  • The Laplace transform is injective in the sense that if $\lt{f} = \lt{g}$, then $f = g$ a.e. (Lerch’s theorem)

    • In particular, if $f$ and $g$ are continuous, then $f=g$ for all $t \geq 0$
  • The variables $t$ and $s$ are typically thought of as “time” and “frequency”, respectively

Inverse Laplace transform of $F: \C \to \C$: $$ \ilt{F}(t) = f(t) = \frac{1}{2\pi i} \int_{a - \infty i}^{a + \infty i} F(s) e^{st} \, ds, $$ where $a \in \R$ is such that the contour of integration (i.e., the line $\Re(s) = a$) lies in the region of convergence of $F$

  • The formula above is called Mellin’s integral formula and is derived from the Fourier inversion theorem
  • In practice, the inverse Laplace transform is computed by inspection (using tables of known Laplace transforms; see below)

Dirac delta “function”: Borel measure defined by $\delta(E) = 1_E(0)$; $\lt{\delta}$ is interpreted as $\lt{\delta}(s) = \int_0^\infty e^{-st} \, d\delta = 1$

Heaviside step function: $H(t) = 1_{[0, \infty)}(t)$ (sometimes denoted $\theta$ or $u$)

Unit step functions: $u_a(t) = H(t - a) $

We observe that $1_{[a, b)} = u_a - u_b$, which is useful for expressing piecewise functions.


When applicable, the Laplace transform can be used to solve ODEs by transforming both sides of the equation, solving for the transform of the independent variable, and computing its inverse transform.

In the tables below, we assume where necessary that $f(t) = 0$ for all $t < 0$. Thus, for instance, $t^n$ represents $t^n H(t)$ and $f(t-a)$ represents $f(t-a) u_a(t)$.

| $f(t)$ | $F(s)$ |
| --- | --- |
| $\delta(t)$ | $1$ |
| $H(t)$ | $1/s$ |
| $t^n,\ n \in \Z_{\geq 0}$ | $n!/s^{n+1}$ |
| $e^{at}$ | $1/(s-a)$ |
| $\cos(kt) = \Re(e^{ikt})$ | $s/(s^2 + k^2)$ |
| $\sin(kt) = \Im(e^{ikt})$ | $k / (s^2 + k^2)$ |
| $\cosh(kt)$ | $s / (s^2 - k^2)$ |
| $\sinh(kt)$ | $k/(s^2 - k^2)$ |

These “elementary” transforms can be combined with the general properties in the table below.

| Function | Laplace transform |
| --- | --- |
| $f^{(n)}(t)$ | $s^n F(s) - \sum_{k=1}^{n} s^{n-k} f^{(k-1)}(0^-)$ |
| $t^n f(t)$ | $(-1)^n F^{(n)}(s)$ |
| $e^{at} f(t)$ | $F(s - a)$ |
| $f(t - a)$ | $e^{-as} F(s)$ |
| $f(kt)$ | $k^{-1} F(s/k)$ |
| $k^{-1} f(t/k)$ | $F(ks)$ |
| $(f * g)(t) = \int_0^t f(t - \tau) g(\tau) \, d\tau$ | $(FG)(s)$ |
| $(fg)(t)$ | $(F * G)(s) = (2\pi i)^{-1} \int_{a - \infty i}^{a + \infty i} F(s - \sigma)G(\sigma) \, d\sigma$ |

For example, $\lt{u_a}(s) = e^{-as}/s$ by the time shift property.
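
As another example, consider $x'' + x = \delta(t)$ with $x(0^-) = x'(0^-) = 0$: transforming both sides gives $(s^2 + 1) X(s) = 1$, so $X(s) = 1/(s^2 + 1)$ and $$ x(t) = \Ilt{\frac{1}{s^2 + 1}} = \sin t \qquad (t \geq 0). $$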

Initial value theorem

If $f: [0, \infty) \to \C$ is of exponential type and $\lim_{t \to 0^+} f(t)$ exists (and is finite), then $\lim_{t \to 0^+} f(t) = \lim_{s \to \infty} s F(s)$.

Final value theorem

If $f: [0, \infty) \to \C$ is bounded and $\lim_{t \to \infty} f(t)$ exists (and is finite), then $\lim_{t \to \infty} f(t) = \lim_{s \to 0^+} s F(s)$.

If $L$ is a linear differential operator with constant coefficients (that is, a polynomial differential operator), then the solution to $Lx = \delta(t)$ is called the impulse response of the system.

Linear ODEs with constant coefficients and the impulse response

If $L$ is a linear differential operator and $x_\delta $ is the impulse response of $Lx = \delta(t)$, then the solution to $Lx = f(t)$ is $x_\delta * f$.

(To see this, we can convolve both sides of the equation $Lx_\delta = \delta$ with $f$ and use the fact that $Lx_\delta * f = L(x_\delta * f)$ and $\delta * f = f$. We assume that $x_\delta$ is sufficiently smooth to justify the manipulation on the left-hand side.)
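
For example, since the impulse response of $x'' + x$ is $x_\delta = \sin t$ (computed above), the solution of $x'' + x = f(t)$ with rest initial conditions is $$ x(t) = (x_\delta * f)(t) = \int_0^t \sin(t - \tau) f(\tau) \, d\tau. $$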

Power series methods

Consider the second-order linear homogeneous ODE $$ p(x) y'' + q(x) y' + r(x) y = 0, $$ where $p, q, r$ are polynomials. 6

Ordinary; singular point $x_0$: $p(x_0) \neq 0$; $p(x_0) = 0$ 7

Regular singular point $x_0$: $x_0$ is a singular point but $(x- x_0) q(x) / p(x)$ and $(x - x_0)^2 r(x) / p(x)$ tend to finite limits as $x \to x_0$ 8

At an ordinary point $x_0$, we can substitute the ansatz $y = \sum_{k=0}^\infty a_k (x - x_0)^k$ and derive a recurrence relation for the coefficients $a_k$.

At a regular singular point, we substitute the ansatz $y = \sum_{k=0}^{\infty} a_k (x - x_0)^{k + r}$. By equating the lowest-order coefficient of the resulting series (usually that of $(x - x_0)^r$) to zero, we obtain the indicial equation.
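
For example, $x_0 = 0$ is a regular singular point of $4xy'' + 2y' + y = 0$; substituting the ansatz, the coefficient of $x^{r-1}$ is $[4r(r-1) + 2r] a_0 = 2r(2r - 1) a_0$, so the indicial equation is $2r(2r - 1) = 0$ with roots $r_1 = 1/2$, $r_2 = 0$ (the two Frobenius series turn out to sum to $\sin\sqrt{x}$ and $\cos\sqrt{x}$).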

Method of Frobenius

Suppose that $x_0$ is a regular singular point of $p(x) y'' + q(x) y' + r(x) y = 0$ and that $r_1, r_2$ are the roots of the indicial equation. Then the ODE has two linearly independent solutions $y_1, y_2$ as given below.

  • If $r_1 - r_2 \notin \Z$, then $$ y_1 = \sum_{k=0}^\infty a_k (x - x_0)^{k + r_1},\quad y_2 = \sum_{k=0}^\infty b_k (x - x_0)^{k + r_2} $$

  • If $r_1 = r_2$, then $$ y_1 = \sum_{k=0}^\infty a_k (x - x_0)^{k + r_1},\quad y_2 = y_1 \ln (x - x_0) + \sum_{k=0}^\infty b_k (x - x_0)^{k + r_2} $$

  • If $r_1 - r_2 \in \Z \setminus \set{0}$, then $$ y_1 = \sum_{k=0}^\infty a_k (x - x_0)^{k + r_1},\quad y_2 = C y_1 \ln (x - x_0) + \sum_{k=0}^\infty b_k (x - x_0)^{k + r_2} $$

If $r_1, r_2 \in \C$ with $r_1 = \conj{r_2}$, we can take $\Re(y_1)$ and $\Im(y_1)$ to obtain linearly independent real solutions.

Vector ODEs

A vector ODE is simply an equation of the form $\vec{F}(x, \vec{y}, \vec{y}', \dots, \vec{y}^{(n)}) = \vec{0}$, where $\vec{y}$ is a function of $x$. Most definitions and properties given in the section on Scalar ODEs generalize to vector ODEs in the obvious manner.

We note that a linear vector ODE takes the form $$ \sum_{i=0}^{n} A_i(x) \vec{y}^{(i)} = \vec{b}(x), $$ where the $A_i$ are matrix-valued functions and the source term $\vec{b}$ is vector-valued.

When the (vector) functions of a fundamental system of solutions to a homogeneous linear vector ODE are concatenated horizontally, the resulting matrix is called a fundamental matrix (solution).

The analogue of the phase line for first-order autonomous vector ODEs $\vec{y}' = \vec{F}(\vec{y})$ is the direction/vector field. A plot of the vector field $\vec{F}$ is made in phase space (i.e., the domain of $\vec{y}$), in contradistinction to slope fields, which are plotted in solution space. Solution curves (also called trajectories) parametrized by $x$ may also be drawn in phase space, even for non-autonomous ODEs. Such curves remain tangent to the vector field. A plot of several trajectories in phase space is called a phase portrait.


Vector ODEs are useful for expressing (systems of) scalar ODEs. An $n$th-order (scalar) ODE $F(x, y, y', \dots, y^{(n)}) = 0 $ can equivalently be viewed as a system of $n$ 1st-order ODEs $$ \begin{cases} y_0' = y_1 \\ y_1' = y_2 \\ \;\vdots \\ y_{n-2}' = y_{n-1} \\ F(x, y_0, y_1, \dots, y_{n-1}, y_{n-1}') = 0 \end{cases}\,, $$ where $y_i = y^{(i)}$ for $i = 0, \dots, n-1$. Similarly, a system of $k$ ODEs of order $n$ (with $k$ dependent variables) can be converted to a system of $kn$ ODEs of order 1 (with $kn$ dependent variables).

We will therefore restrict our attention to systems of 1st-order ODEs, which can themselves be regarded as individual 1st-order vector ODEs. In the example above, we can define $\vec{y} = (y_0, \dots, y_{n-1})$ and $\vec{F}(x, \vec{y}, \vec{y}') = (\vec{y}_1' - \vec{y}_2, \dots, \vec{y}_{n-1}' - \vec{y}_n, F(x, \vec{y}_1, \vec{y}_2, \dots, \vec{y}_n, \vec{y}_n'))$, where $\vec{v}_i$ denotes the $i$th component of $\vec{v}$.

Linear vector ODEs

A first-order linear vector ODE (which we shall refer to interchangeably as a “linear system of ODEs”) can be written in the form $\vec{x}' = A(t) \vec{x} + \vec{f}(t)$ (cf. first-order linear scalar ODEs). We will primarily be interested in an even more specific type of first-order linear vector ODEs: those with constant coefficients (i.e., $A(t)$ is constant).

As with second-order linear scalar ODEs, we begin with the homogeneous case.

First-order linear homogeneous vector ODEs with constant coefficients

The general solution of $\vec{x}' = A \vec{x}$ is $\vec{x} = e^{tA} \vec{c} $, where $\vec{c} = \vec{x}(0)$. In other words, $X(t) = e^{tA}$ is a fundamental matrix solution.

(See Appendix B for the definition of $e^{tA}$ and further remarks.)

Corollary:

If $X$ is a fundamental matrix solution to $\vec{x}' = A\vec{x}$, then $e^{tA} = X(t) X^{-1}(0)$. 9
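
For example, if $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, then $A^2 = -I$, so $$ e^{tA} = I \cos t + A \sin t = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix}, $$ and the general solution of $\vec{x}' = A\vec{x}$ is $\vec{x} = e^{tA} \vec{c}$; the trajectories are circles about the origin (a centre, in the terminology below).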

We note that $\vec{x}' = A \vec{x}$ is an autonomous system whose sole critical point is the origin.

In general, if $\vec{x}_0$ is an isolated critical point of a first-order autonomous system (not necessarily linear), we can classify the critical point according to the behaviour of trajectories near it. We can also describe the stability of the point itself as follows.

Stable critical point: for every $\epsilon > 0$ there is a $\delta > 0$ such that any trajectory starting within $\delta$ of $\vec{x}_0$ never travels further than $\epsilon$ from $\vec{x}_0$

Asymptotically stable critical point: $\vec{x}_0$ is stable and moreover every trajectory beginning sufficiently close to $\vec{x}_0$ converges to it

Unstable critical point: $\vec{x}_0$ is not stable

A stable critical point that is not asymptotically so is sometimes called marginally/neutrally stable.

When the system is two-dimensional and $A$ is invertible 10 and has distinct eigenvalues, the signs of the eigenvalues of $A$ determine the behaviour near and stability of the origin:

| Eigenvalues of $A$ | Behaviour | Stability |
| --- | --- | --- |
| Real and positive | (Nodal) source | Unstable |
| Real and negative | (Nodal) sink | A. stable |
| Real and of opposite signs | Saddle point | Unstable |
| Purely imaginary | Centre | M. stable |
| Complex with positive real parts | Spiral source | Unstable |
| Complex with negative real parts | Spiral sink | A. stable |

(The “centre” is so called because trajectories are ellipses centred at the origin.)

If $A$ has a repeated eigenvalue, we further distinguish between proper and improper nodes according as the eigenvalue is nondefective or defective: 11

| Eigenvalues of $A$ | Behaviour | Stability |
| --- | --- | --- |
| Real, positive, equal, and nondefective | Proper nodal source | Unstable |
| Real, positive, equal, and defective | Improper nodal source | Unstable |
| Real, negative, equal, and nondefective | Proper nodal sink | A. stable |
| Real, negative, equal, and defective | Improper nodal sink | A. stable |

We now consider the general (i.e., non-homogeneous) case (cf. First-order scalar ODEs).

First-order linear vector ODEs with constant coefficients

To solve a first-order linear vector ODE of the form $\vec{x}' + P\vec{x} = \vec{f}(t)$:

  • Multiply both sides by the integrating factor $R(t) = e^{tP}$, which yields $(R(t) \vec{x})' = R(t) \vec{f}(t)$

  • Integrate: $$ \vec{x} = {R(t)}^{-1} \int R(t) \vec{f}(t) \, dt = e^{-tP} \int e^{tP} \vec{f}(t) \, dt $$

However, the integrating factor method is not applicable when $P$ is a function of $t$, since in general $(\exp(\int P(t) \, dt))' \neq P(t) \exp(\int P(t) \, dt)$ as matrix multiplication is noncommutative.

(For constant-coefficient linear systems, the method of undetermined coefficients may also be used, where the ‘coefficients’ are vectors. One difference is that ansatz terms augmented by powers of $t$ must be included in addition to, rather than in place of other terms. For instance, if $(0, 1) \, e^t$ is part of the complementary solution, then both $\vec{a} e^t$ and $\vec{b} te^t$ must be included in the ansatz. In general, this method is unwieldy for vector ODEs.)

To handle variable-coefficient systems, we can use variation of parameters:

First-order linear vector ODEs with variable coefficients

To solve a first-order linear vector ODE of the form $\vec{x}' = A(t)\vec{x} + \vec{f}(t)$:

  • Let $X(t)$ be a fundamental matrix solution to the homogeneous equation

  • Substitute the ansatz $\vec{x}_p = X(t) \vec{c}(t)$, which yields $X(t) \vec{c}'(t) = \vec{f}(t)$

  • Integrate: $$ \vec{x}_p = X(t) \int X(t)^{-1} \vec{f}(t) \, dt $$

(Note that this reduces to the integrating factor method when $A(t) \equiv -P$ and $X(t) = e^{-tP}$.)


When $A$ is diagonalizable, we can solve first- and second-order linear vector ODEs of the form $\vec{x}^{(\nu)} = A \vec{x} + \vec{f}(t)$ ($\nu = 1, 2$) by exploiting the eigendecomposition of $A$.

Linear vector ODEs with constant coefficients

To solve a first- or second-order linear vector ODE of the form $\vec{x}^{(\nu)} = A\vec{x} + \vec{f}(t)$, where $A$ is diagonalizable:

  • Let $(\lambda_1, \vec{v}_1), \dots, (\lambda_n, \vec{v}_n)$ be the eigenpairs of $A$

  • Write $\vec{x}(t) = \sum_{i=1}^{n} \vec{v}_i \xi_i(t)$ and $\vec{f}(t) = \sum_{i=1}^{n} \vec{v}_i \phi_i(t)$

  • Solve for the $\phi_i$ (in the linear system $\begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix} \begin{bmatrix} \phi_1 & \cdots & \phi_n \end{bmatrix}^T = \vec{f}$)

  • Substitute these decompositions into the equation and equate the coefficients of the eigenvectors, which yields $n$ decoupled first- or second-order linear scalar ODEs $$ \begin{align*} \xi_1^{(\nu)} &= \lambda_1 \xi_1 + \phi_1, \\ &\;\; \vdots \\ \xi_n^{(\nu)} &= \lambda_n \xi_n + \phi_n, \end{align*} $$

Nonlinear vector ODEs

Suppose that $\vec{x}_0$ is an isolated critical point of the first-order autonomous vector ODE $\vec{x}' = \vec{F}(\vec{x})$ (i.e., $\vec{F}(\vec{x}_0) = \vec{0}$). By the Taylor expansion of $\vec{F} $ about $\vec{x}_0$, we have $$ \vec{x}' \approx \vec{F}(\vec{x}_0) + d\vec{F}_{\vec{x}_0} (\vec{x} - \vec{x}_0) = d\vec{F}_{\vec{x}_0} (\vec{x} - \vec{x}_0). $$ In other words, near $\vec{x}_0$, the nonlinear ODE behaves like the constant-coefficient linear ODE $\vec{v}' = d\vec{F}_{\vec{x}_0} \vec{v}$ – its linearization at $\vec{x}_0$ – provided that the linearized system itself has an isolated critical point (see Linear vector ODEs). In this case, the nonlinear ODE is called almost linear at $\vec{x}_0$.

However, we note that a centre of the linearized system tends to correspond to a spiral point in the nonlinear ODE, since it is unlikely that the real parts of the eigenvalues of $d\vec{F}$ all vanish in a whole neighbourhood of $\vec{x}_0$. (Similarly, a node arising from a repeated real eigenvalue could also correspond to a spiral point.)


Conservative equation: equation of the form $x''(t) + f(x) = 0$

A conservative equation can be written as the first-order autonomous vector ODE $$ \underbrace{ \begin{bmatrix} x \\ y \end{bmatrix}}_{\vec{x}}' = \underbrace{ \begin{bmatrix} y \\ -f(x) \end{bmatrix}}_{\vec{F}(\vec{x})}, $$ where $y = x'$.

By integrating both sides of the scalar equation with respect to $x$, we find that $$ \begin{align*} C = \int x'' \, dx + \int f(x) \, dx &= \int y \frac{dy}{dx} \, dx + \int f(x) \, dx \\ &= \int y \, dy + \int f(x) \, dx \\ &= \frac{1}{2} y^2 + \int f(x) \, dx \end{align*} $$ since $x'' = y' = dy/dx \cdot dx/dt$. Hence the quantity $\frac{1}{2} y^2 + \int f(x) \, dx$ is conserved for any solution, which can be used to find trajectories in phase (i.e., $xy$) space. 12

The critical points are clearly the points on the $x$-axis where $f(x) = 0$. Since $$ d\vec{F} = \begin{bmatrix} 0 & 1 \\ -f'(x) & 0 \end{bmatrix}, $$ a critical point is almost linear when $f'(x) \neq 0$. The eigenvalues of $d\vec{F}$ at such a point are $\pm \sqrt{-f'(x)}$, so the point is a saddle point if $f'(x) < 0$ and a centre if $f'(x) > 0$. 13 14

Partial differential equations (PDEs)

Separation of variables

Boundary-value problems

Let $A: \mathcal{D}(A) \subseteq L^2([a, b]) \to L^2([a, b])$ be the linear operator given by $A = -\frac{d^2}{dt^2}$. We will take $\mathcal{D}(A)$ to be some subspace of $C^2([a, b])$ of functions satisfying certain boundary conditions (BCs). The three main types are:

  • Dirichlet boundary conditions: $x(a) = x(b) = 0$
  • Neumann boundary conditions: $x'(a) = x'(b) = 0$
  • Periodic boundary conditions: $x(a) = x(b),\, x'(a) = x'(b)$

Dirichlet and Neumann boundary conditions can also be mixed; e.g., $x'(a) = 0,\, x(b) = 0$ are “Neumann-Dirichlet” boundary conditions.

All such conditions ensure that $A$ is a symmetric operator (by integrating by parts, we see that $A$ is symmetric if and only if $[x_1' x_2 - x_1 x_2']_a^b = 0$ for all $x_1, x_2 \in \mathcal{D}(A)$).

The eigenvectors of $A$ are the nontrivial solutions to the boundary value problem (BVP) $x'' + \lambda x = 0$, where $x$ satisfies one of the boundary conditions above. In this context, they are called eigenfunctions. As $A$ is symmetric, its eigenvalues are real, and eigenfunctions of distinct eigenvalues are orthogonal.

| Boundary conditions | Eigenfunctions | Eigenvalues |
| --- | --- | --- |
| Dirichlet on $[0, L]$ | $\sin(\sqrt{\lambda_n} x)$ | $\lambda_n = (n \pi / L)^2,\, n \geq 1$ |
| Neumann on $[0, L]$ | $\cos(\sqrt{\lambda_n} x)$ | $\lambda_n = (n\pi / L)^2,\, n \geq 0$ |
| Dirichlet-Neumann on $[0, L]$ | $\sin(\sqrt{\lambda_n} x)$ | $\lambda_n = ((n-1/2) \pi / L)^2,\, n \geq 1$ |
| Neumann-Dirichlet on $[0, L]$ | $\cos(\sqrt{\lambda_n} x)$ | $\lambda_n = ((n-1/2) \pi / L)^2,\, n \geq 1$ |
| Periodic on $[-L, L]$ | $\sin(\sqrt{\lambda_n} x);\; \cos(\sqrt{\mu_n} x)$ | $\lambda_n = (n\pi / L)^2,\, n \geq 1;\; \mu_n = (n\pi / L)^2,\, n \geq 0$ |
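
For example, the first row is obtained as follows: for the Dirichlet problem $X'' + \lambda X = 0,\ X(0) = X(L) = 0$, there are no nontrivial solutions with $\lambda \leq 0$, while for $\lambda > 0$ the general solution $X = c_1 \cos(\sqrt{\lambda}\, x) + c_2 \sin(\sqrt{\lambda}\, x)$ satisfies the boundary conditions only if $c_1 = 0$ and $\sin(\sqrt{\lambda}\, L) = 0$, i.e., $\sqrt{\lambda} = n\pi/L$ for some $n \geq 1$.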

Fredholm alternative

For any given $\lambda$, either:

  • $\lambda$ is an eigenvalue of $A$
  • $(A - \lambda I)x = f$ has a unique solution for every $f \in C([a, b])$

Trigonometric series

Fourier series of $f \in L^2([-L, L])$: $$ \frac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right) + b_n \sin\left(\frac{n\pi}{L} t\right), $$ where $$ \begin{align*} a_0 &= 2 \frac{\inner{f}{1}}{\norm{1}^2} &&= \frac{1}{L} \int_{-L}^L f(t) \, dt, \\ a_n &= \frac{\inner{f}{\cos(n\pi t/L)}}{\norm{\cos(n\pi t / L)}^2} &&= \frac{1}{L} \int_{-L}^L f(t) \cos\left(\frac{n\pi}{L} t\right) \, dt, \\ b_n &= \frac{\inner{f}{\sin(n\pi t/L)}}{\norm{\sin(n\pi t / L)}^2} &&= \frac{1}{L} \int_{-L}^L f(t) \sin\left(\frac{n\pi}{L} t\right) \, dt, \end{align*} $$ and the inner product is $\inner{f}{g} = \int_{-L}^{L} f(t) g(t) \, dt$

Periodic extension of $f: (-L, L] \to \R$: the function $F: \R \to \R$ given by $F(t) = f(t - 2kL)$ for $t \in (-L, L] + 2kL,\, k \in \Z $

Convergence of Fourier series

If the periodic extension $F$ of $f$ is piecewise smooth, then the Fourier series of $f$ converges pointwise to $[F(t^-) + F(t^+)]/2$ for every $t$.

Differentiation and antidifferentiation of Fourier series

Suppose that the periodic extension of $f$ is piecewise smooth.

If $f$ is continuous and $f'$ is piecewise smooth, then the Fourier series of $f'$ may be computed by differentiating that of $f$ term-wise.

Similarly, an antiderivative of $f$ may be computed by antidifferentiating the Fourier series of $f$ term-wise. (N.B.: In general, the resulting series will not be a Fourier series.)

Decay of Fourier series coefficients

If the periodic extension of $f$ is $C^k$ and $f^{(k+1)}$ is piecewise smooth, then $a_n, b_n \in \mathcal{O}(n^{-(k+2)})$ as $n \to \infty$.

Parseval’s identity $$ \frac{1}{2L} \norm{f}^2 = \frac{a_0^2}{4} + \sum_{n=1}^\infty \frac{a_n^2 + b_n^2}{2}. $$


Even extension of $f: [0, L] \to \R$: the function $F_\mathrm{even}: (-L, L] \to \R$ given by $F_\mathrm{even}(t) = \begin{cases} f(t), & t \in [0, L] \\ f(-t), & t \in (-L, 0) \end{cases}$

Odd extension of $f: [0, L] \to \R$: the function $F_\mathrm{odd}: (-L, L] \to \R$ given by $F_\mathrm{odd}(t) = \begin{cases} f(t), & t \in [0, L] \\ -f(-t), & t \in (-L, 0) \end{cases}$

The periodic extensions of $F_\mathrm{even}$ and $F_\mathrm{odd}$ are called the even and odd periodic extensions of $f$, respectively.

Fourier cosine series of $f \in L^2([0, L])$: the Fourier series of its even extension, viz., $$ \frac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right),\quad a_n = \frac{2}{L} \int_{0}^L f(t) \cos\left(\frac{n\pi}{L} t\right) \, dt $$

Fourier sine series of $f \in L^2([0, L])$: the Fourier series of its odd extension, viz., $$ \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} t\right),\quad b_n = \frac{2}{L} \int_{0}^L f(t) \sin\left(\frac{n\pi}{L} t\right) \, dt $$

Thus, by definition, the Fourier series of an even (resp., odd) function coincides with its Fourier cosine (resp., sine) series. As above, if the even (resp., odd) periodic extension of $f$ is piecewise smooth, then the Fourier cosine (resp., sine) series of $f$ converges pointwise a.e. to $F_\mathrm{even}$ (resp., $F_\mathrm{odd}$).


To solve a BVP of the form $x'' + \lambda x = f(t)$ (where $\lambda$ is not an eigenvalue of the homogeneous problem), we can expand $x$ and $f$ as Fourier sine, Fourier cosine, or Fourier series according as the BCs are Dirichlet, Neumann, or periodic. We then equate coefficients and solve for those of $x$ as with power series.

Fourier series are also effective in solving periodically forced harmonic oscillator equations. If necessary, resonant terms in the Fourier series are multiplied by $t$.

Second-order linear PDEs

Every second-order linear PDE in two independent variables $x, y$ can be written in the form $Au_{xx} + 2Bu_{xy} + Cu_{yy} + \cdots + G = 0$, where $A, B, C, \dots, G$ are functions of $x$ and $y$.

  • Elliptic PDE: $B^2 - AC < 0$
    • Describes an ‘equilibrium state’
    • Obeys a ‘maximum principle’; smooths out singularities
    • Ex.: Laplace’s equation $u_{xx} + u_{yy} = 0$, Poisson’s equation $u_{xx} + u_{yy} = f(x, y)$
  • Parabolic PDE: $B^2 - AC = 0$
    • Describes ‘diffusion’
    • Obeys a ‘maximum principle’; smooths out singularities
    • Ex.: the heat equation $u_t = \alpha u_{xx},\, \alpha > 0 $
  • Hyperbolic PDE: $B^2 - AC > 0$
    • Describes ‘wave propagation’
    • Ex.: the wave equation $u_{tt} = c^2 u_{xx},\, c > 0$

The one-dimensional heat equation

To solve the PDE $u_t = \alpha u_{xx},\, \alpha > 0$ with boundary conditions $\mathrm{(BC)}$ and initial condition $u(x, 0) = f(x)\; \mathrm{(IC)}$:

  • Substitute the ansatz $u(x, t) = X(x) T(t)$, which yields $\frac{T'}{\alpha T} = \frac{X''}{X} \equiv -\lambda$
  • Solve the BVP $X'' + \lambda X = 0$ with $\mathrm{(BC)}$ to obtain eigenfunctions $X_n$ with eigenvalues $\lambda_n$
  • Solve the ODEs $T_n' = -\lambda_n \alpha T_n$, which yield $T_n = \exp(-\lambda_n \alpha t)$
  • Write $u(x, t) = \sum_n c_n u_n(x, t)$, where $u_n(x, t) = X_n(x) T_n(t)$
  • Impose $\mathrm{(IC)}$ and match the coefficients with those of the appropriate trigonometric series for $f$

The technique of writing $u$ as a product of functions in each independent variable is called separation of variables.
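
For example, with Dirichlet boundary conditions $u(0, t) = u(L, t) = 0$, the eigenfunctions are $X_n = \sin(n\pi x/L)$ with $\lambda_n = (n\pi/L)^2$, and the solution is $$ u(x, t) = \sum_{n=1}^{\infty} c_n e^{-(n\pi/L)^2 \alpha t} \sin\left(\frac{n\pi}{L} x\right),\quad c_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n\pi}{L} x\right) dx, $$ i.e., the $c_n$ are the Fourier sine coefficients of $f$.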

The one-dimensional inhomogeneous heat equation

To solve the PDE $u_t = \alpha u_{xx} + q(x),\, \alpha > 0$ with boundary conditions $\mathrm{(BC)}$ and initial condition $u(x, 0) = f(x)\; \mathrm{(IC)}$:

  • Write $u(x, t) = \tilde{u}(x) + v(x, t)$, where $\tilde{u}$ and $v$ are the steady-state and transient parts of $u$, respectively
  • Solve $\alpha \tilde{u}_{xx} + q(x) = 0$ with boundary conditions depending on $\mathrm{(BC)}$
  • Solve $v_t = \alpha v_{xx}$ with boundary conditions depending on $\mathrm{(BC)}$ and initial condition $v(x, 0) = f(x) - \tilde{u}(x)$

For instance, if $\mathrm{(BC)}$ is $u_x(0, t) = a,\, u(L, t) = b $, then $\tilde{u}_x(0) = a,\, \tilde{u}(L) = b$ and $v_x(0, t) = v(L, t) = 0$.

The one-dimensional wave equation

To solve the PDE $y_{tt} = a^2 y_{xx},\, a > 0$ with boundary conditions $\mathrm{(BC)}$ and initial conditions $y(x, 0) = f(x)\; \mathrm{(IC_0)},\, y_t(x, 0) = g(x)\; \mathrm{(IC_1)}$:

  • Write $y(x, t) = w(x, t) + z(x, t)$
  • Solve $w_{tt} = a^2 w_{xx}$ with side conditions $\mathrm{(BC)}, \mathrm{(IC_0')}, \mathrm{(IC_1)}$ by separation of variables, where $\mathrm{(IC_0')}$ is $w(x, 0) = 0$
  • Solve $z_{tt} = a^2 z_{xx}$ with side conditions $\mathrm{(BC)}, \mathrm{(IC_0)}, \mathrm{(IC_1')}$ by separation of variables, where $\mathrm{(IC_1')}$ is $z_t(x, 0) = 0$

Laplace’s equation in two dimensions

To solve the PDE $u_{xx} + u_{yy} = 0$ on a rectangle with boundary conditions:

  • Write $u(x, y) = u_N(x, y) + u_E(x, y) + u_S(x, y) + u_W(x, y)$, where the subscripts denote the ‘north’, ‘east’, ‘south’, and ‘west’ sides of the rectangle
  • For each side function $\tilde{u}$, solve $\tilde{u}_{xx} + \tilde{u}_{yy} = 0$ by separation of variables with the boundary conditions for all other sides set to zero

Note: it is convenient to use hyperbolic functions in the second step.

A similar method can be used to solve Laplace’s equation in a semi-infinite strip; however, exponential functions should then be used instead of hyperbolic functions. In this case, it is also assumed that the solution is bounded in the strip.

Separation of variables is also applicable to Laplace’s equation in polar coordinates, $\frac{1}{r^2} u_{\theta \theta} + \frac{1}{r} u_r + u_{rr} = 0$. In this context, a second-order Cauchy-Euler equation arises (for the radial problem), whose solutions are recorded below. Again, we assume that the solution is bounded on its domain.

The second-order Cauchy-Euler equation

To solve an ODE of the form $ax^2 y'' + bxy' + cy = 0$:

  • Compute the roots $r_1, r_2$ of its indicial equation $ar(r-1) + br + c = 0$
  • If $r_1, r_2$ are real and distinct, the general solution is $y = c_1 x^{r_1} + c_2 x^{r_2}$
  • If $r_1, r_2 $ are real and equal, the general solution is $y = c_1 x^{r_1} + c_2 x^{r_1} \ln \abs{x}$
  • If $r_1, r_2$ are complex with $r_{1, 2} = \alpha \pm \beta i$, the general solution is $y = c_1 x^\alpha \cos(\beta \ln \abs{x}) + c_2 x^{\alpha} \sin(\beta \ln \abs{x})$
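
For example, separating $u = R(r) \Theta(\theta)$ in the polar Laplace equation with $\Theta = \cos(n\theta)$ or $\sin(n\theta)$ yields the Cauchy-Euler equation $r^2 R'' + r R' - n^2 R = 0$, whose indicial equation $s(s-1) + s - n^2 = s^2 - n^2 = 0$ has roots $s = \pm n$; for $n \geq 1$, $R = c_1 r^n + c_2 r^{-n}$ (and boundedness at the origin forces $c_2 = 0$ on a disc), while for $n = 0$, $R = c_1 + c_2 \ln r$.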

Integral transform methods

The heat equation $u_t = \alpha u_{xx},\, \alpha > 0$ on the line $\R = (-\infty, \infty)$ with initial condition $u(x, 0) = f(x)$ can be solved using the Fourier transform.

Taking the Fourier transform in $x$ of the equation and initial condition, we obtain $\hat{u}_t = \alpha (2\pi i \xi)^2 \hat{u}$ for $t > 0$ and $\hat{u} = \hat{f}$ for $t = 0$. This is just a first-order linear ODE in $t$ with solution $\hat{u} = e^{\alpha (2\pi i \xi)^2 t} \hat{f} = e^{-4\pi^2\xi^2 \alpha t} \hat{f}$. Taking the inverse transform then yields $u = \mathcal{F}^{-1} \set{e^{-4\pi^2\xi^2 \alpha t}} * f$.

The function $$ \Phi(x, t) = \mathcal{F}^{-1} \set{e^{-4\pi^2\xi^2 \alpha t}}(x, t) = \frac{1}{\sqrt{4\pi\alpha t}} \exp \left(-\frac{x^2}{4\alpha t}\right) $$ is called the fundamental solution of the 1D heat equation (or the 1D heat kernel). Thus, the solution formula for the 1D heat equation is $$ u(x, t) = \int_\R \Phi(x-y, t)f(y) \, dy = \frac{1}{\sqrt{4\pi\alpha t}} \int_\R \exp \left(-\frac{(x-y)^2}{4\alpha t}\right) f(y) \, dy. $$

Eigenvalue problems

Sturm-Liouville theory

Regular Sturm-Liouville problem: BVP of the form $$ \begin{align*} [p(x) y']' + q(x) y &= -\lambda w(x) y, && x \in [a, b]; \\ \alpha_1 y(a) + \alpha_2 y'(a) &= 0, \\ \beta_1 y(b) + \beta_2 y'(b) &= 0, \end{align*} $$ where $$ \begin{align*} p(x), w(x) &> 0, && x \in [a, b]; \\ p, p', q, w &\in C([a, b]); \\ \alpha_1^2 + \alpha_2^2, \beta_1^2 + \beta_2^2 &> 0. \end{align*} $$ (The last condition simply ensures that the $\alpha_i$ are not both zero, and likewise for the $\beta_i$.)

Solutions of regular Sturm-Liouville problems

Every regular Sturm-Liouville problem has a strictly increasing sequence $\set{\lambda_n}$ of real eigenvalues tending to infinity. Moreover, each $\lambda_n$ is simple and its eigenspace is spanned by an eigenfunction $y_n$ with exactly $n-1$ zeroes in $(a, b)$.

In addition, if $q(x) \leq 0$ on $[a, b]$, $\alpha_1 / \alpha_2 \leq 0$, and $\beta_1/\beta_2 \geq 0$, then the eigenvalues are nonnegative.

Any second-order eigenvalue problem of the form $$ a(x) y'' + b(x) y' + c(x) y = -\lambda y $$ can be converted to Sturm-Liouville form by multiplying both sides by the integrating factor $\mu = \frac{1}{a} \exp(\int \frac{b}{a} \, dx)$. This yields $p = a\mu$, $q = c\mu$, and $w = \mu$.
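
For example, the equation $y'' - 2x y' = -\lambda y$ has $a = 1$, $b = -2x$, $c = 0$, so $\mu = e^{-x^2}$ and the Sturm-Liouville form is $$ \left( e^{-x^2} y' \right)' = -\lambda e^{-x^2} y, $$ with $p = w = e^{-x^2}$ and $q = 0$.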


The underlying linear differential operator of an SLP is the operator $L$ defined by $Ly = -\frac{1}{w} [(p y')' + qy]$. As with the second derivative operator (defined in Boundary-value problems), $L$ is symmetric with respect to the weighted $L^2$ inner product $\inner{f}{g} = \int_a^b f(x) g(x) w(x) \, dx$ for functions satisfying the boundary conditions of the SLP (indeed, the former is a special case of $L$ with $p \equiv 1$, $q \equiv 0$, and $w \equiv 1$). Likewise, we have:

Fredholm alternative

For any given $\lambda$, either:

  • $\lambda$ is an eigenvalue of $L$
  • $(L - \lambda I)x = f$ has a unique solution for every $f \in C([a, b])$

Eigenfunction series

Just as square-integrable functions admitted trigonometric series expansions, $w$-weighted square-integrable functions admit eigenfunction series expansions. Namely, for $f \in L^2([a, b], w \, dx)$, we can define $$ \sum_{n=1}^{\infty} c_n y_n(x), $$ where $$ c_n = \frac{\inner{f}{y_n}}{\norm{y_n}^2} = \frac{\int_a^b f(x) y_n(x) w(x) \, dx}{\int_a^b [y_n(x)]^2 w(x) \, dx} $$ and the $y_n$ are the eigenfunctions of a regular SLP.

If $f$ is continuous and piecewise smooth, then such a series converges pointwise to $f$ on $[a, b]$.

Appendix A: The Jordan normal form

Let $V$ denote an $n$-dimensional $\F$-vector space, where $\F \in \set{\R, \C}$. $T$ will denote an arbitrary endomorphism of $V$ unless otherwise specified.

The characteristic and minimal polynomials

Characteristic polynomial of $T$: $\chi_T(x) = \det(xI - T)$

$\chi_T$ thus defined is a monic polynomial of degree $n$ whose roots are the eigenvalues of $T$.

Cayley-Hamilton theorem $$ \chi_T(T) = 0. $$

Minimal polynomial of $T$: the (nonzero) monic polynomial $\mu_T \in \F[x] $ of minimal degree satisfying $\mu_T(T) = 0$ 15

Invertibility and the minimal polynomial

$T$ is invertible if and only if $\mu_T(0) \neq 0$.

Corollary:

Eigenvalues and the minimal polynomial

$\lambda$ is an eigenvalue of $T$ if and only if $\mu_T(\lambda) = 0$. 16

Generalized eigenvectors

(Rank-$k$) generalized eigenvector of $T$ of eigenvalue $\lambda$: vector $\vec{v}$ such that $(T - \lambda I)^k \vec{v} = \vec{0}$ but $(T - \lambda I)^{k-1} \vec{v} \neq \vec{0}$

Jordan chain generated by the rank-$k$ generalized eigenvector $\vec{v}$ of $T$ of eigenvalue $\lambda$: $\vec{v}_k, \vec{v}_{k-1}, \dots, \vec{v}_1$, where $\vec{v}_j = (T - \lambda I)^{k-j} \vec{v}$ (note that $\vec{v}_j$ is a rank-$j$ generalized eigenvector)

Generalized eigenspace of $T$ for the eigenvalue $\lambda$ (denoted $G(\lambda, T)$): the set of all generalized eigenvectors of $T$ of eigenvalue $\lambda$, along with $\vec{0}$

Recall that the geometric multiplicity of an eigenvalue $\lambda$ is $\dim(E(\lambda, T))$, where $E(\lambda, T) = \ker(T-\lambda I)$ is the (ordinary) eigenspace of $T$ for $\lambda$. The algebraic multiplicity of $\lambda$ is its multiplicity as a root of $\chi_T$, but also admits a characterization in terms of eigenspaces.

The algebraic multiplicity of an eigenvalue $\lambda$ of $T$ is $\dim(G(\lambda, T))$.

It is easy to see that $G(\lambda, T) = \ker((T - \lambda I)^n)$ and thence that $G(\lambda, T)$ is a $T$-invariant subspace. Note also that $(T - \lambda I) \restriction_{G(\lambda, T)}$ is nilpotent by definition.

The Jordan normal form

Suppose that $\mathbb{F} = \C$ and let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. Then $V = \oplus_{i=1}^{m} G(\lambda_i, T)$.

Thus, if $m_i$ is the algebraic multiplicity of $\lambda_i$, there is a basis of $V$ for which the matrix of $T$ is of the form $$ \begin{bmatrix} A_1 & & \\ & \ddots & \\ & & A_m \end{bmatrix}, $$ where $A_i$ is the ($m_i \times m_i$) matrix of $T \restriction_{G(\lambda_i, T)}$.

Now if $N \in \End(V)$ is nilpotent, there is a basis of $V$ for which its matrix is strictly upper triangular: namely, the concatenation of bases for $\ker(N), \ker(N^2), \dots$. Hence it is possible to write each $A_i$ in the form $$ \begin{bmatrix} \lambda_i & & * \\ & \ddots & \\ & & \lambda_i \end{bmatrix}, $$ since $T \restriction_{G(\lambda_i, T)} = (T - \lambda_i I) \restriction_{G(\lambda_i, T)} + (\lambda_i I) \restriction_{G(\lambda_i, T)}$. But a stronger statement concerning the structure of nilpotent operators can be made.

If $N \in \End(V)$ is nilpotent, there exist vectors $\vec{v}_1, \dots, \vec{v}_p \in V$ and integers $k_1 \geq \dots \geq k_p \geq 0$ such that $$ \begin{align*} N^{k_1} \vec{v}_1, N^{k_1 - 1}\vec{v}_1, &\dots, \vec{v}_1, \\ & \dots, \\ N^{k_p} \vec{v}_p, N^{k_p - 1}\vec{v}_p, &\dots, \vec{v}_p \end{align*} $$ is a basis for $V$, where $N^{k_1 + 1} \vec{v}_1 = \dots = N^{k_p + 1} \vec{v}_p = \vec{0}$.

Taking $N = (T - \lambda_i I) \restriction_{G(\lambda_i, T)}$ above (with $V = G(\lambda_i, T)$), we find that $A_i$ can itself be written in the form $$ \begin{bmatrix} A_{i, 1} & & \\ & \ddots & \\ & & A_{i, p} \end{bmatrix}, $$ where $A_{i, j}$ is a $k_j \times k_j$ matrix of the form $$ \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix} + \lambda_i I. $$ Hence there is a basis for which the matrix of $T$ is block diagonal, with each block being of the form $$ \begin{bmatrix} \lambda_i & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix} \tag{$*$} $$ for some eigenvalue $\lambda_i$. These blocks are called Jordan blocks, and the resulting form of the matrix of $T$ is called its Jordan normal (or canonical) form. In other words, $ T = M J M^{-1}$, where $M$ is an invertible matrix and $J$ is a Jordan matrix (a block diagonal matrix of Jordan blocks).

As for the columns of $M$, suppose that the basis vectors for the Jordan block $(*)$ are $\vec{v}_1, \dots, \vec{v}_{k_j}$. Then $T\vec{v}_1 = \lambda_i \vec{v}_1, T\vec{v}_2 = \vec{v}_1 + \lambda_i \vec{v}_2, \dots, T\vec{v}_{k_j} = \vec{v}_{k_j - 1} + \lambda_i \vec{v}_{k_j}$, so the basis is a (reversed) Jordan chain generated by the rank-$k_j$ generalized eigenvector $\vec{v}_{k_j}$ of eigenvalue $\lambda_i$. Consequently, $M$ consists of Jordan chains of generalized eigenvectors of $T$, with one chain for each Jordan block (N.B.: there may be multiple blocks for a given eigenvalue!).

Another useful observation is that $\mathrm{null}((T - \lambda I)^k) - \mathrm{null}((T - \lambda I)^{k-1})$ is the number of linearly independent generalized eigenvectors of rank $k$, and is therefore the number of Jordan blocks of size $\geq k$.

Eigenvalues and the Jordan normal form

The number of Jordan blocks with $\lambda_i$’s on their diagonals is the geometric multiplicity of $\lambda_i$; the sum of their sizes is the algebraic multiplicity of $\lambda_i$.

(If these are equal, each Jordan block for $\lambda_i$ must be $1 \times 1$, and the eigenvalue is called semisimple. $T$ is diagonalizable if and only if all its eigenvalues are semisimple.)

The minimal polynomial and the Jordan normal form

The minimal polynomial of $T$ is $\prod_{i=1}^{m} (x - \lambda_i)^{k_{i, 1}} $, where $k_{i, 1}$ is the size of the largest Jordan block for $\lambda_i$ (also called the index of $\lambda_i$).

Appendix B: The matrix exponential

If $A \in \F^{n \times n}$, where $\F \in \set{\R, \C}$, the matrix exponential of $A$ is defined as $$ e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!}. $$ Note that $e^0 = I$ and that $e^{MBM^{-1}} = Me^{B}M^{-1}$ for all invertible matrices $M$. We also have $e^A e^B = e^{A+B}$ if $A$ and $B$ commute.

Clearly, if $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, then $e^\Lambda = \mathrm{diag}(e^{\lambda_1}, \dots, e^{\lambda_n})$. From this, the matrix exponential of any diagonalizable matrix is readily computed.

For the general case, it suffices to compute $e^J$, where $J$ is the Jordan normal form of $A$. Since $J$ is a direct sum of Jordan blocks, we can compute the exponential of each block separately and take the direct sum of the results. But each Jordan block is of the form $\lambda I + N$, where $N$ is nilpotent, and $e^{\lambda I + N} = e^{\lambda I} e^N = e^\lambda e^N$. If the block size (i.e., the index of nilpotence of $N$) is $m$, we have $$ \begin{align*} e^N &= \sum_{k=0}^{m-1} \frac{N^k}{k!} \\ &= \frac{1}{0!} I + \frac{1}{1!} \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix} + \cdots + \frac{1}{(m-1)!} \begin{bmatrix} 0 & & & 1 \\ & \ddots & & \\ & & \ddots & \\ & & & 0 \end{bmatrix} \\ &= \begin{bmatrix} 1 & \frac{1}{1!} & \cdots & \frac{1}{(m-1)!} \\ & \ddots & \ddots & \vdots \\ & & \ddots & \frac{1}{1!} \\ & & & 1 \end{bmatrix}. \end{align*} $$ The Jordan normal form may also be used to compute $e^{tA}$, simply by computing $$ e^{t(\lambda I + N)} = e^{t\lambda} e^{tN} = e^{t\lambda} \sum_{k=0}^{m-1} \frac{(tN)^k}{k!} = e^{t\lambda} \begin{bmatrix} 1 & \frac{t}{1!} & \cdots & \frac{t^{m-1}}{(m-1)!} \\ & \ddots & \ddots & \vdots \\ & & \ddots & \frac{t}{1!} \\ & & & 1 \end{bmatrix} $$ for each Jordan block $\lambda I + N$. (Of course, if $A$ is diagonalizable, each Jordan block is of the form $\begin{bmatrix} \lambda \end{bmatrix}$, and $e^{t \begin{bmatrix} \lambda \end{bmatrix}} = \begin{bmatrix} e^{t\lambda} \end{bmatrix}$.)


When the general solution to $\vec{x}' = A\vec{x}$ is sought, it suffices to compute $Me^{tJ}$ (when $A = M J M^{-1}$), since the invertible matrix $M^{-1}$ may be absorbed into the constant $\vec{c}$, i.e., $\vec{x} = e^{tA} \vec{c} = Me^{tJ} \tilde{\vec{c}}$.

However, the solution in this form may involve complex coefficients and functions even when $A$ is real. In general, one could use the real Jordan normal form.

For a simple (or, more generally, a nondefective) complex eigenvalue of $A$, explicit formulae are relatively simple to state. Since $A$ is real, its complex eigenvalues and eigenvectors come in conjugate pairs. Given an eigenpair $(\lambda, \vec{v})$ of $A$ with $\lambda \notin \R$, we can replace the solutions $\vec{v} e^{t\lambda}, \conj{\vec{v}}e^{t\conj{\lambda}}$ with $\Re(\vec{v} e^{t \lambda}), \Im(\vec{v} e^{t \lambda})$. Using Euler’s formula, we obtain $$ \begin{align*} \Re(\vec{v} e^{t \lambda}) &= \vec{a} e^{\alpha t} \cos(\beta t) - \vec{b} e^{\alpha t} \sin(\beta t) \\ \Im(\vec{v} e^{t \lambda}) &= \vec{a} e^{\alpha t} \sin(\beta t) + \vec{b} e^{\alpha t} \cos(\beta t) \end{align*}\,, $$ where $\vec{v} = \vec{a} + \vec{b}i$ and $\lambda = \alpha + \beta i$. (Incidentally, this can be written as $$ \begin{align*} \begin{bmatrix} \Re(\vec{v} e^{t \lambda}) \\ \Im(\vec{v} e^{t \lambda}) \end{bmatrix} = (e^{\alpha t} R(\beta t) \otimes I_n) \begin{bmatrix} \Re(\vec{v}) \\ \Im(\vec{v}) \end{bmatrix}, \end{align*} $$ where $R(\beta t)$ denotes the $2 \times 2$ matrix for counterclockwise rotation by $\beta t$.)


For reference and illustration, we also note what happens in the case of a double eigenvalue $\lambda$ with defect 1 (i.e., geometric multiplicity $2 - 1 = 1$). If $\vec{v}_2, \vec{v}_1$ is a Jordan chain for the eigenvalue $\lambda$, we obtain the solutions $\vec{v}_1 e^{\lambda t}, (\vec{v}_2 + \vec{v}_1 t) e^{\lambda t}$.


  1. This is because the space of solutions is the kernel of a linear differential operator! ↩︎

  2. If $\mu = -b/2a$ and $\omega = \sqrt{b^2 - 4ac}/{2a}$ so that $\lambda_{1, 2} = \mu \pm \omega$, the general solution in this case can be written as $y = c_1 e^{\mu x} \cosh(\omega x) + c_2 e^{\mu x} \sinh(\omega x)$, which unifies the real and complex cases. In fact, if $\lambda_1 > \lambda_2$, we have $\mu = (\lambda_1 + \lambda_2)/2$ and $\omega = \mathrm{sgn}(a)(\lambda_1 - \lambda_2)/2$ (where, by symmetry, the factor of $\mathrm{sgn}(a)$ is immaterial in $\cosh$ and may be absorbed into $c_2$ for $\sinh$). ↩︎

  3. The (ordinary) frequency is $f = \omega_0 / 2\pi$ (cycles per unit time) and the period is $T = 1/f = 2\pi / \omega_0$. ↩︎

  4. We reserve the variable $\omega$ for the angular frequency of a sinusoidal forcing function (see below) and therefore use $\omega_1$ where $\omega$ was previously. ↩︎

  5. The converse, however, is false: $f_1(x) = x^2$ and $f_2(x) = x\abs{x}$ are (continuously) differentiable on $\R$ and their Wronskian vanishes identically thereon, yet they are not linearly dependent on any neighbourhood of the origin. ↩︎

  6. More generally, $p, q, r$ can be meromorphic functions. ↩︎

  7. More generally, an ordinary point is one at which $q/p$ and $r/p$ are analytic. ↩︎

  8. More generally, a regular singular point is a singular point at which $q/p$ has at most a pole of order $1$ and $r/p$ has at most a pole of order $2$. ↩︎

  9. Given that $\vec{x} = X(t) \vec{c}$, solving for $\vec{c}$ in terms of $\vec{x}(0)$ yields $\vec{x}(t) = X(t) X^{-1}(0) \vec{x}(0)$, whence the result follows. ↩︎

  10. We assume that $A$ is invertible (or equivalently, that both its eigenvalues are nonzero) so that the origin is an isolated critical point of the system. Indeed, if there were even one other critical point, by linearity there would be infinitely many (constituting a subspace of the plane), so the origin is an isolated critical point if and only if it is the sole critical point. ↩︎

  11. When the eigenvalues are distinct and of the same sign, nodes are sometimes also called “improper” owing to their graphical similarity to the latter case. ↩︎

  12. As an example, a free undamped mass-spring system obeys the conservative equation $x'' +kx/m = 0$. The quantity $(x’)^2 / 2 + kx^2 / 2m$, or equivalently, $m(x’)^2 / 2 + kx^2 / 2$, is therefore conserved. But this is just the total energy of the system: the first term is kinetic energy; the second is potential energy! ↩︎

  13. We can rule out the possibility of a spiral (source or sink) since the conservation equation implies that trajectories are symmetric about the $x$-axis. ↩︎

  14. As an example, a free undamped pendulum obeys the conservative equation $\theta'' + g \sin \theta / L = 0$. The critical points occur when $\theta' = 0$ (the pendulum is stationary) and $\theta = 0$ (it is hanging straight down) or $\theta = \pi$ (it is balanced upside down). The former is evidently a stable centre ($g \cos 0 / L > 0$); the latter an unstable saddle point ($g \cos \pi / L < 0$). ↩︎

  15. That such a polynomial exists follows from a dimension argument; uniqueness is immediate. Moreover, Euclidean division shows that the polynomials that annihilate $T$ are exactly the multiples of $\mu_T$. ↩︎

  16. Apply the main result to $S = \lambda I - T$, whose minimal polynomial satisfies $\mu_S(\lambda - x) = \mu_T(x)$. ↩︎
