Let
$$
Hf\p{x} = \sup_{r > 0} \frac{1}{2r} \int_{x-r}^{x+r} \abs{f\p{y}} \,\diff{y}
$$
be the Hardy-Littlewood maximal function. Observe that $\norm{Hf}_{L^\infty} \leq \norm{f}_{L^\infty}$, so by sublinearity,
$$
\begin{aligned}
m\p{\set{Hf > \alpha}}
    &\leq m\p{\set{Hf\chi_{\set{\abs{f} > \frac{\alpha}{2}}} > \frac{\alpha}{2}}} + m\p{\set{Hf\chi_{\set{\abs{f} \leq \frac{\alpha}{2}}} > \frac{\alpha}{2}}} \\
    &= m\p{\set{Hf\chi_{\set{\abs{f} > \frac{\alpha}{2}}} > \frac{\alpha}{2}}}.
\end{aligned}
$$
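The two facts used in this step can be spelled out as follows. This is only a sketch; $f_1$ and $f_2$ are names introduced here for the two truncations of $f$ at height $\frac{\alpha}{2}$ and do not appear elsewhere in the argument.

```latex
% f_1 and f_2 are hypothetical names for the two truncations of f.
f_1 = f\chi_{\set{\abs{f} > \frac{\alpha}{2}}},
\qquad
f_2 = f\chi_{\set{\abs{f} \leq \frac{\alpha}{2}}},
\qquad
f = f_1 + f_2.

% Sublinearity gives Hf <= Hf_1 + Hf_2, so Hf > alpha forces
% at least one of the two summands to exceed alpha/2.
\set{Hf > \alpha}
    \subseteq \set{Hf_1 > \tfrac{\alpha}{2}} \cup \set{Hf_2 > \tfrac{\alpha}{2}}.

% The L^infty bound shows that the second set carries no measure.
\norm{Hf_2}_{L^\infty} \leq \norm{f_2}_{L^\infty} \leq \frac{\alpha}{2}
\implies
m\p{\set{Hf_2 > \tfrac{\alpha}{2}}} = 0.
```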
Then by the Hardy-Littlewood maximal inequality,
$$
\begin{aligned}
\norm{Hf}_{L^p}^p
    &= \int_0^\infty p\alpha^{p-1} m\p{\set{Hf > \alpha}} \,\diff\alpha \\
    &\leq \int_0^\infty p\alpha^{p-1} m\p{\set{Hf\chi_{\set{\abs{f} > \frac{\alpha}{2}}} > \frac{\alpha}{2}}} \,\diff\alpha \\
    &\leq \int_0^\infty Cp\alpha^{p-2} \int_{\set{\abs{f\p{x}} > \frac{\alpha}{2}}} \abs{f\p{x}} \,\diff{x} \,\diff\alpha \\
    &= Cp \int_\R \abs{f\p{x}} \int_0^{2\abs{f\p{x}}} \alpha^{p-2} \,\diff\alpha \,\diff{x}
    && \p{\text{Fubini-Tonelli}} \\
    &= \frac{C2^{p-1}p}{p - 1} \int_\R \abs{f\p{x}}^p \,\diff{x} \\
    &= \frac{C2^{p-1}p}{p - 1} \norm{f}_{L^p}^p
\end{aligned}
$$
for some constant $C > 0$. We now apply this bound twice: observe that by Fubini-Tonelli,
$$
\norm{y \mapsto \norm{f_y}_{L^p\p{\R}}}_{L^p\p{\R}}^p
    = \int_\R \norm{f_y}_{L^p\p{\R}}^p \,\diff{y}
    = \int_\R \p{\int_\R \abs{f\p{x, y}}^p \,\diff{x}} \,\diff{y}
    = \norm{f}_{L^p\p{\R^2}}^p,
$$
and so
$$
\begin{aligned}
\abs{\frac{1}{4r\rho} \int_{-r}^r \int_{-\rho}^\rho f\p{x + h, y + \ell} \,\diff{h} \,\diff{\ell}}
    &\leq \frac{1}{2r} \int_{-r}^r \p{\frac{1}{2\rho} \int_{-\rho}^\rho \abs{f\p{x + h, y + \ell}} \,\diff{h}} \,\diff\ell \\
    &\leq \frac{1}{2r} \int_{-r}^r Hf_{y+\ell}\p{x} \,\diff\ell \\
    &\leq H\p{Hf_y\p{x}}\p{y} \\
\implies
\abs{\br{Mf}\p{x, y}}
    &\leq H\p{Hf_y\p{x}}\p{y}.
\end{aligned}
$$
Hence, if we replace $C$ with $\p{\frac{C2^{p-1}p}{p - 1}}^{1/p}$, so that $\norm{Hg}_{L^p\p{\R}} \leq C\norm{g}_{L^p\p{\R}}$ for every $g \in L^p\p{\R}$, we have
$$
\begin{aligned}
\norm{Mf}_{L^p\p{\R^2}}
    &\leq \norm{\p{x, y} \mapsto H\p{Hf_y\p{x}}\p{y}}_{L^p\p{\R^2}} \\
    &= \norm{x \mapsto \norm{y \mapsto H\p{Hf_y\p{x}}\p{y}}_{L^p\p{\R}}}_{L^p\p{\R}} \\
    &\leq C\norm{x \mapsto \norm{y \mapsto Hf_y\p{x}}_{L^p\p{\R}}}_{L^p\p{\R}} \\
    &\leq C^2 \norm{x \mapsto \norm{y \mapsto f_y\p{x}}_{L^p\p{\R}}}_{L^p\p{\R}} \\
    &= C^2 \norm{f}_{L^p\p{\R^2}}.
\end{aligned}
$$
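The two inequalities in the middle of this chain are each a single application of the one-dimensional bound $\norm{Hg}_{L^p\p{\R}} \leq C\norm{g}_{L^p\p{\R}}$, first in the variable $y$ and then in the variable $x$. A sketch of both steps, where $g_x$ is a name introduced here purely for illustration:

```latex
% Step 1 (bound in y): for fixed x, set g_x(y) = Hf_y(x). Here g_x is a
% hypothetical name; the outer H acts on g_x as a function of y.
g_x\p{y} = Hf_y\p{x}
\implies
\norm{y \mapsto H\p{Hf_y\p{x}}\p{y}}_{L^p\p{\R}}
    = \norm{Hg_x}_{L^p\p{\R}}
    \leq C\norm{g_x}_{L^p\p{\R}}.

% Step 2 (bound in x): Fubini-Tonelli swaps the order of integration,
% and then the one-dimensional bound is applied to f_y for each fixed y.
\int_\R \int_\R \abs{Hf_y\p{x}}^p \,\diff{y} \,\diff{x}
    = \int_\R \norm{Hf_y}_{L^p\p{\R}}^p \,\diff{y}
    \leq C^p \int_\R \norm{f_y}_{L^p\p{\R}}^p \,\diff{y}
    = C^p \norm{f}_{L^p\p{\R^2}}^p.
```

Taking $p$-th roots in the second display gives the passage from $C$ to $C^2$.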
Let
$$
\br{H_rf}\p{x, y} = \frac{1}{4r^3} \int_{-r}^r \int_{-r^2}^{r^2} \abs{f\p{x + h, y + \ell}} \,\diff{h} \,\diff{\ell}.
$$
By the previous calculation with $\rho = r^2$, the maximal operator $f \mapsto \sup_{r > 0} H_rf$ is of strong type $\p{p, p}$, with a constant that we again call $C$. As in the proof of the Lebesgue differentiation theorem, let $\alpha > 0$. First, observe that by Chebyshev's inequality,
$$
m\p{\set{\p{x, y} \in \R^2 \st \sup_{r > 0} \br{H_rf}\p{x, y} > \alpha}}
    \leq \frac{1}{\alpha^p} \norm{\sup_{r > 0} H_rf}_{L^p}^p
    \leq \frac{C^p}{\alpha^p} \norm{f}_{L^p}^p
$$
for any $f \in L^p\p{\R^2}$. Notice that $A_rg \to g$ pointwise as $r \to 0$ whenever $g$ is continuous, so let $\epsilon > 0$ and pick $g$ continuous and compactly supported such that $\norm{f - g}_{L^p} < \epsilon$. Then
$$
\begin{gathered}
\abs{\br{A_rf}\p{x, y} - f\p{x, y}}
    \leq \abs{\br{A_r\p{f - g}}\p{x, y}} + \abs{\br{A_rg}\p{x, y} - g\p{x, y}} + \abs{f\p{x, y} - g\p{x, y}} \\
\implies
\limsup_{r\to0} \,\abs{\br{A_rf}\p{x, y} - f\p{x, y}}
    \leq \limsup_{r\to0} \,\abs{\br{A_r\p{f - g}}\p{x, y}} + \abs{f\p{x, y} - g\p{x, y}}.
\end{gathered}
$$
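The middle term drops out in the limit because $g$ is continuous. The sketch below makes this explicit under one assumption: $A_r$ is not defined in this section, so it is taken here to be the average over the same rectangle as $H_r$, without the absolute value.

```latex
% Assumption: A_r averages over the rectangle used for H_r, with no absolute value.
\br{A_rg}\p{x, y}
    = \frac{1}{4r^3} \int_{-r}^r \int_{-r^2}^{r^2} g\p{x + h, y + \ell} \,\diff{h} \,\diff{\ell}.

% The average of a constant is that constant, so g(x, y) can be moved
% inside the integral; continuity of g at (x, y) then gives the limit.
\abs{\br{A_rg}\p{x, y} - g\p{x, y}}
    \leq \frac{1}{4r^3} \int_{-r}^r \int_{-r^2}^{r^2}
        \abs{g\p{x + h, y + \ell} - g\p{x, y}} \,\diff{h} \,\diff{\ell}
    \leq \sup_{\abs{h} \leq r^2,\, \abs{\ell} \leq r}
        \abs{g\p{x + h, y + \ell} - g\p{x, y}}
    \xrightarrow{r \to 0} 0.
```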
Hence,
$$
\begin{aligned}
m\p{\set{\limsup_{r\to0} \,\abs{\br{A_rf}\p{x, y} - f\p{x, y}} > \alpha}}
    &\leq m\p{\set{\limsup_{r\to0} \,\abs{\br{A_r\p{f - g}}\p{x, y}} > \frac{\alpha}{2}}} + m\p{\set{\abs{f\p{x, y} - g\p{x, y}} > \frac{\alpha}{2}}} \\
    &\leq m\p{\set{\sup_{r > 0} \br{H_r\p{f - g}}\p{x, y} > \frac{\alpha}{2}}} + m\p{\set{\abs{f\p{x, y} - g\p{x, y}} > \frac{\alpha}{2}}} \\
    &\leq \p{\frac{2C}{\alpha}}^p \norm{f - g}_{L^p}^p + \p{\frac{2}{\alpha}}^p \norm{f - g}_{L^p}^p \\
    &\leq \frac{2^p\p{C^p + 1}}{\alpha^p} \epsilon^p.
\end{aligned}
$$
Since $\epsilon > 0$ was arbitrary, $m\p{\set{\limsup_{r\to0} \,\abs{\br{A_rf}\p{x, y} - f\p{x, y}} > \alpha}} = 0$ for all $\alpha > 0$, so $\br{A_rf}\p{x, y} \to f\p{x, y}$ as $r \to 0$ for almost every $\p{x, y} \in \R^2$.
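The passage from "measure zero for every $\alpha > 0$" to almost-everywhere convergence is a countable union argument. A sketch, with $E_\alpha$ a name introduced here for the exceptional set at level $\alpha$:

```latex
% E_alpha is a hypothetical name for the set where the limsup exceeds alpha.
E_\alpha = \set{\p{x, y} \in \R^2 \st
    \limsup_{r\to0} \,\abs{\br{A_rf}\p{x, y} - f\p{x, y}} > \alpha},
\qquad
m\p{E_\alpha} = 0 \text{ for all } \alpha > 0.

% Convergence fails at (x, y) exactly when the limsup is positive,
% i.e. when (x, y) lies in E_{1/n} for some n.
\set{\p{x, y} \in \R^2 \st
    \limsup_{r\to0} \,\abs{\br{A_rf}\p{x, y} - f\p{x, y}} > 0}
    = \bigcup_{n=1}^\infty E_{1/n}
\implies
m\p{\bigcup_{n=1}^\infty E_{1/n}}
    \leq \sum_{n=1}^\infty m\p{E_{1/n}} = 0.
```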