Let $A \in \C^{m \times n}$. The QR factorization is a factorization of $A$ as $QR$, where $Q \in \C^{m \times m}$ is unitary and $R \in \C^{m \times n}$ is (rectangular) upper triangular.[^1] We will show below that such a factorization always exists by describing three different methods to compute it.
When $A$ has full column rank, we have $a_j = \sum_{i \leq j} r_{ij} q_i$ for each $j$, so $\span \, \set{a_j}_{j \leq k} \subseteq \span \, \set{q_j}_{j \leq k}$ for each $k$. As these subspaces are both $k$-dimensional, they must be equal, which also implies that the diagonal entries of $R$ are nonzero. Moreover, if $\hat{Q}$ denotes the left $m \times n$ submatrix of $Q$ and $\hat{R}$ denotes the upper $n \times n$ submatrix of $R$, we have the thin/reduced QR factorization $A = \hat{Q} \hat{R}$.
The thin QR factorization of a full column rank matrix is nearly unique in the sense that if $A = \tilde{Q} \tilde{R}$ for some $\tilde{Q} \in \C^{m \times n}$ with orthonormal columns and some upper triangular $\tilde{R} \in \C^{n \times n}$, then $\tilde{Q} = \hat{Q}D$ and $\hat{R} = D\tilde{R}$ for some diagonal matrix $D$ whose diagonal entries have unit modulus. This follows from the observation that $D := \hat{Q}^* \tilde{Q} = \hat{R} \tilde{R}^{-1} = \hat{R}^{-*} \tilde{R}^*$ must be both upper and lower triangular. Thus, if we specify a (complex) sign for each diagonal entry of $\hat{R}$, the factorization is unique.
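As a quick numerical sanity check, the thin factorization and the unit-modulus rescaling freedom described above can be verified with NumPy's `np.linalg.qr`. This is a sketch using a randomly generated $A$ (which has full column rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Thin/reduced QR: Q is m x n with orthonormal columns, R is n x n upper triangular.
Q, R = np.linalg.qr(A)  # mode='reduced' is the default

assert np.allclose(Q.conj().T @ Q, np.eye(n))  # orthonormal columns
assert np.allclose(np.triu(R), R)              # upper triangular
assert np.allclose(Q @ R, A)                   # A = QR

# Near-uniqueness: for any diagonal D with unit-modulus entries,
# (QD)(D^* R) is another valid thin QR factorization of the same A.
D = np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, size=n)))
Q2, R2 = Q @ D, D.conj() @ R
assert np.allclose(Q2 @ R2, A)
```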
Gram–Schmidt orthogonalization
Suppose that $(a_j)_{j \geq 1}$ is a sequence of vectors in a Hilbert space $V$. Gram–Schmidt orthogonalization defines an orthogonal sequence of vectors $(b_j)_{j \geq 1}$ in $V$ such that $\mathcal{A}_k := \span \, \set{a_j}_{j \leq k} = \mathcal{B}_k := \span \, \set{b_j}_{j \leq k}$ for each $k$. To wit, let $\proj{b} := \proj{\span{\set{b}}}$ for $b \in V$; that is, $$ \proj{b} a = \begin{cases} \frac{\inner{a}{b}}{\inner{b}{b}} b & \text{if $b \neq 0$}, \\ 0 & \text{if $b = 0$}. \end{cases} $$ We then inductively define $$ b_j := a_j - \sum_{i < j} \proj{b_i} a_j. $$ Assuming that $\set{b_j}_{j < k}$ is orthogonal for a given $k$, we then have $\inner{b_k}{b_j} = \inner{a_k - \sum_{i < k} \proj{b_i} a_k}{b_j} = \inner{a_k - \proj{b_j} a_k}{b_j} = 0$ for all $j < k$, which shows that $\set{b_j}_{j \leq k}$ is orthogonal. Moreover, if $\mathcal{A}_{k-1} = \mathcal{B}_{k-1}$, then $b_k \in a_k - \mathcal{B}_{k-1} = a_k - \mathcal{A}_{k-1} \subseteq \mathcal{A}_k$ and $a_k \in b_k + \mathcal{B}_{k-1} \subseteq \mathcal{B}_k$, so $\mathcal{A}_k = \mathcal{B}_k$.
To compute a QR factorization of $A$, we can apply Gram–Schmidt orthogonalization to the columns of $A =: \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}$ as follows. For each $j \leq m$, we inductively define $b_j := a_j - \sum_{i < j} \proj{q_i} a_j$ if $j \leq n$ and the right-hand expression is nonzero; otherwise, we select an arbitrary nonzero $b_j \in \mathcal{B}_{j-1}^\perp$. In either case, we then define $q_j := \frac{b_j}{\norm{b_j}}$. We thereby obtain an orthonormal basis $\set{q_j}_{j \leq m}$ of $\C^m$ such that $a_j = \sum_{i \leq \min \set{j,\,m}} r_{ij} q_i$ for some $r_{ij} \in \C$, as required.
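The procedure above can be sketched in NumPy as follows; this is a minimal implementation of the thin factorization only, assuming $A$ has full column rank and $m \geq n$ (so the "arbitrary nonzero $b_j$" branch never arises), and the function name `qr_gram_schmidt` is ours:

```python
import numpy as np

def qr_gram_schmidt(A):
    """Thin QR via classical Gram-Schmidt (assumes A has full column rank)."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        b = A[:, j].astype(complex)             # working copy of a_j
        for i in range(j):
            R[i, j] = Q[:, i].conj() @ A[:, j]  # r_ij = <a_j, q_i>
            b -= R[i, j] * Q[:, i]              # subtract proj_{q_i} a_j
        R[j, j] = np.linalg.norm(b)             # nonzero by the rank assumption
        Q[:, j] = b / R[j, j]                   # q_j = b_j / ||b_j||
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Q, R = qr_gram_schmidt(A)
```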
Modified Gram–Schmidt orthogonalization
In Gram–Schmidt orthogonalization, we define $b_j = (I - \sum_{i < j} \proj{b_i}) a_j$. Since the $b_i$ are orthogonal, this can equivalently be written as $b_j = (I - \proj{b_{j-1}}) \cdots (I - \proj{b_2}) (I - \proj{b_1}) a_j$, so computationally speaking, the projection operator $I - \proj{b_i}$ can be applied to all $a_j$ with $i < j$ (assuming there are finitely many of them) as soon as $b_i$ is generated. The resulting algorithm is known as modified Gram–Schmidt orthogonalization and exhibits greater numerical stability than “classical” Gram–Schmidt orthogonalization.
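The reordering described above can be sketched as follows (same assumptions and naming caveats as before: full column rank, $m \geq n$, `qr_mgs` is our name). Note that each subtraction now acts on the already partially orthogonalized working vector rather than on the original column:

```python
import numpy as np

def qr_mgs(A):
    """Thin QR via modified Gram-Schmidt (assumes full column rank).

    As soon as q_i is generated, I - proj_{q_i} is applied to every
    remaining column, instead of applying all projections to a_j at once.
    """
    m, n = A.shape
    V = A.astype(complex)                       # working copies of the columns
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for i in range(n):
        R[i, i] = np.linalg.norm(V[:, i])
        Q[:, i] = V[:, i] / R[i, i]
        for j in range(i + 1, n):
            R[i, j] = Q[:, i].conj() @ V[:, j]
            V[:, j] -= R[i, j] * Q[:, i]        # apply I - proj_{q_i} to column j
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Q, R = qr_mgs(A)
```

In exact arithmetic this produces the same $Q$ and $R$ as the classical variant; the two differ only in rounding behavior.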
Householder reflections
Suppose that $v$ is a nonzero vector in a Hilbert space $V$. The reflection operator across the hyperplane $\set{v}^\perp$ is defined for $x \in V$ by $$ \refl{v} x := (I - 2\proj{v}) \, x = x - \frac{2\inner{x}{v}}{\inner{v}{v}} v. $$ Since $\proj{v}$ is idempotent and self-adjoint, $\refl{v}$ is involutory and self-adjoint and therefore unitary.
A Householder reflection is a reflection operator $H : \C^d \to \C^d$ that zeroes out all components of some vector $x$ except for its first component $x_1$; we assume that the other components are not already all zeroes. In other words, $Hx = \alpha e_1$ for some $\alpha \in \C$, where $e_1 := \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^\tp$ and $x \notin \span \, \set{e_1}$.
As $H$ is unitary and self-adjoint, we must have $\abs{\alpha} = \norm{x}$ and $\inner{Hx}{x} = \alpha \conj{x_1} \in \R$, which implies that $\alpha = \pm \sign(x_1) \norm{x}$ (unless $x_1 = 0$, in which case the only constraint is $\abs{\alpha} = \norm{x}$). Since $\refl{w} x = \alpha e_1$ if and only if $\frac{2\inner{x}{w}}{\inner{w}{w}} w = x - \alpha e_1$, using the Householder vector $v := x - \alpha e_1$ guarantees that $H := \refl{v}$ satisfies $Hx = \alpha e_1$. A conventional choice of $\alpha$ in this context is $\alpha = -\sign(x_1) \norm{x}$ so as to maximize $\norm{v}^2 = 2(\norm{x}^2 \mp \abs{x_1} \norm{x})$ for the sake of numerical stability.
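To make the choice of $\alpha$ concrete, here is a small numerical sketch (NumPy, randomly drawn complex $x$) constructing the Householder vector and checking that the resulting reflection sends $x$ to $\alpha e_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# Conventional choice alpha = -sign(x_1) ||x||, Householder vector v = x - alpha e_1.
sign = x[0] / abs(x[0]) if x[0] != 0 else 1.0
alpha = -sign * np.linalg.norm(x)
v = x.copy()
v[0] -= alpha

# H = I - 2 v v^* / <v, v> reflects across the hyperplane {v}^perp.
H = np.eye(4) - 2.0 * np.outer(v, v.conj()) / (v.conj() @ v)

assert np.allclose(H @ x, alpha * np.eye(4)[0])  # H x = alpha e_1
assert np.allclose(H @ H.conj().T, np.eye(4))    # H is unitary
```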
To compute a QR factorization of $A$, we can apply Householder reflections successively to introduce zeroes below the diagonal in each column of $A$. More precisely, we can find a Householder reflection $H \in \C^{m \times m}$ such that $$ HA = \begin{bmatrix} \alpha & b^\tp \\ & A' \end{bmatrix}, $$ where $\alpha \in \C$, $b \in \C^{n-1}$, and $A' \in \C^{(m-1) \times (n-1)}$ (allowing $H = I$ if the subdiagonal entries in the first column of $A$ are already zero). Now supposing inductively that $A'$ has a QR factorization $Q' R'$, we obtain the factorization $$ A = \underbrace{H^* \begin{bmatrix} 1 & \\ & Q' \end{bmatrix}}_{Q} \underbrace{\begin{bmatrix} \alpha & b^\tp \\ & R' \end{bmatrix}}_{R}. $$
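The recursion above can be unrolled into an iterative sketch that applies each reflection to the trailing submatrix and accumulates $Q$ as the product of the reflections (a minimal implementation, assuming nothing beyond the text; `qr_householder` is our name):

```python
import numpy as np

def qr_householder(A):
    """Full QR via Householder reflections (complex-valued)."""
    m, n = A.shape
    R = A.astype(complex)
    Q = np.eye(m, dtype=complex)
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        normx = np.linalg.norm(x)
        if normx == 0:
            continue                       # column already zero below the diagonal
        sign = x[0] / abs(x[0]) if x[0] != 0 else 1.0
        alpha = -sign * normx              # conventional sign choice
        v = x.copy()
        v[0] -= alpha                      # Householder vector v = x - alpha e_1
        v /= np.linalg.norm(v)
        # Apply H = I - 2 v v^* to the trailing submatrix, and accumulate
        # Q <- Q * diag(I_k, H) on the right (each H is self-adjoint).
        R[k:, k:] -= 2.0 * np.outer(v, v.conj() @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v.conj())
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Q, R = qr_householder(A)
```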
Givens rotations
Given $a, b \in \C$, consider the problem of finding a $U \in \mathrm{SU}(2)$ and an $r \in \C$ such that $U \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix}$. We have $$ U = \begin{bmatrix} c & s \\ -\conj{s} & \conj{c} \end{bmatrix}, \quad \text{where $\abs{c}^2 + \abs{s}^2 = 1$} $$ and $ac + bs = r$, $b\conj{c} - a\conj{s} = 0$. Since $U$ is unitary, we must have $r = \omega \sqrt{\abs{a}^2 + \abs{b}^2}$ for some $\omega \in \C$ with $\abs{\omega} = 1$, and assuming that $r \neq 0$ (which is to say that $a$ and $b$ are not both zero), we obtain $$ c = \frac{\conj{a}}{\conj{r} \vphantom{\sqrt{\abs{a}^2 + \abs{b}^2}}} = \frac{\omega \conj{a}}{\sqrt{\abs{a}^2 + \abs{b}^2}}, \quad s = \frac{\conj{b}}{\conj{r} \vphantom{\sqrt{\abs{a}^2 + \abs{b}^2}}} = \frac{\omega \conj{b}}{\sqrt{\abs{a}^2 + \abs{b}^2}}. $$ A conventional choice in this context is $\omega = \sign(a)$, along with $U = I$ (and $r = 0$) in the case $a = b = 0$.
Thus, if $a$ and $b$ are the $i$th and $j$th components of some $x \in \C^m$, where $i < j$, the Givens rotation $$ G := \begin{bmatrix} I_{i-1} \\ & c & & s \\ & & I_{(j-1)-i} & \\ & -\conj{s} & & \conj{c} \\ & & & & I_{m-j} \end{bmatrix} $$ is a unitary matrix such that the $j$th component of $Gx$ is zero. (In the real-valued setting, $G$ is indeed a rotation in the $(x_i, x_j)$-plane.) Such rotations can evidently be applied to compute a QR factorization of $A$ by introducing zeroes below the diagonal of $A$ one at a time.
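One way to organize this is to zero out each column bottom-up, rotating each subdiagonal entry against the entry above it. The sketch below (our function names; it applies each $2 \times 2$ rotation to the two affected rows only, which is what makes Givens QR attractive for sparse or structured matrices) uses the formulas for $c$, $s$, and $r$ derived above with the conventional choice $\omega = \sign(a)$:

```python
import numpy as np

def givens(a, b):
    """Return (c, s, r) with [[c, s], [-conj(s), conj(c)]] @ [a, b] = [r, 0]."""
    if a == 0 and b == 0:
        return 1.0 + 0j, 0j, 0j            # U = I, r = 0
    rho = np.hypot(abs(a), abs(b))         # sqrt(|a|^2 + |b|^2)
    omega = a / abs(a) if a != 0 else 1.0  # conventional choice omega = sign(a)
    r = omega * rho
    return np.conj(a) / np.conj(r), np.conj(b) / np.conj(r), r

def qr_givens(A):
    """Full QR by zeroing subdiagonal entries one at a time."""
    m, n = A.shape
    R = A.astype(complex)
    Q = np.eye(m, dtype=complex)
    for j in range(n):
        for i in range(m - 1, j, -1):      # zero R[i, j] against R[i-1, j]
            c, s, _ = givens(R[i - 1, j], R[i, j])
            G = np.array([[c, s], [-np.conj(s), np.conj(c)]])
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]       # rotate the two rows
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.conj().T  # accumulate Q
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Q, R = qr_givens(A)
```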
[^1]: If $A \in \R^{m \times n}$, a QR factorization is defined analogously; i.e., with $Q$ orthogonal.