24. Birkhoff's ergodic theorem

Measure-preserving transformations

Measure-preserving systems

  1. The integer shift system: \((\mathbb Z, \mathcal{P}(\mathbb Z), |\cdot|, S)\) with \(S:\mathbb Z\to \mathbb Z\) given by \[S(x)=x+1.\]

  2. The circle rotation system: \((\mathbb{T}, \mathcal{L}(\mathbb{T}), {\rm d}x, R_{\alpha})\) with the rotation map \(R_{\alpha}:\mathbb{T}\to \mathbb{T}\) given by \(R_{\alpha}(x)= x+\alpha \pmod 1\) for \(\alpha\in\mathbb R\setminus\mathbb{Q}\).

  3. The circle-doubling system: \((\mathbb{T}, \mathcal{L}(\mathbb{T}), {\rm d}x, D_2)\) with the doubling map \(D_2:\mathbb{T}\to \mathbb{T}\) given by \(D_2(x)=2x \pmod1\).

  4. The continued fraction system: \(([0, 1), \mathcal{L}([0, 1)), \mu, T)\) with the Gauss measure \[\mu(A)=\frac{1}{\log2}\int_A\frac{{\rm d}x}{1+x},\] and continued fraction map \(T:[0, 1)\to[0, 1)\) given by \(T(0)=0\) and \[T(x)=\frac{1}{x}\pmod 1, \qquad \text{when $x\not=0$.}\]
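As an informal sanity check of the last example, the Gauss measure of \(T^{-1}[0,t]\) can be compared numerically with that of \([0,t]\). The sketch below assumes the branch decomposition \(T^{-1}[0,t]=\{0\}\cup\bigcup_{n\ge1}[1/(n+t),1/n]\) and truncates the union after finitely many branches; the function names are illustrative and not part of the notes.

```python
import math


def gauss_measure(a, b):
    """Gauss measure of the interval [a, b], assuming 0 <= a <= b <= 1."""
    return (math.log1p(b) - math.log1p(a)) / math.log(2)


def gauss_measure_of_preimage(t, terms=100000):
    """Approximate mu(T^{-1}[0, t]) by summing over the first `terms` branches.

    On the n-th branch, T(x) = 1/x - n, so T(x) <= t exactly when
    x lies in [1/(n + t), 1/n]. The truncation is an approximation.
    """
    return sum(gauss_measure(1.0 / (n + t), 1.0 / n) for n in range(1, terms + 1))


for t in (0.1, 0.5, 0.9):
    print(t, gauss_measure(0.0, t), gauss_measure_of_preimage(t))
```

The two printed columns should agree up to the truncation error, which reflects the telescoping product \(\prod_{n\le N}\frac{(n+1)(n+t)}{n(n+t+1)}\to 1+t\).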

Pointwise convergence

Proposition. Let \((T_k)_{k \in \mathbb{N}}\) be a sequence of linear operators \(T_k : L^p(X) \rightarrow L^p(X)\) on a \(\sigma\)-finite measure space \((X, B(X), \mu)\). If \[C(\alpha) = \sup\limits_{ f \in L^p(X)} \mu(\{x \in X: \sup\limits_{k \in \mathbb{N}} |T_k f(x)| > \alpha \|f\|_{L^p}\}) \xrightarrow[\alpha \rightarrow \infty]{} 0,\] then \[\mathcal L_X^p = \{f \in L^p(X): \lim\limits_{n \rightarrow \infty} T_n f(x) \text{ exists } \mu \text{-almost everywhere on } X\}\] is closed in \(L^p(X)\).

Proof. Define \[\Omega(f)(x) = \limsup\limits_{m,n \rightarrow \infty} |T_n f(x) - T_m f(x)|.\]

We have to show that \(\mu(\{x \in X: \Omega(f)(x) > 0 \}) = 0\) for all \(f \in \overline{\mathcal L_X^p}^{\|\cdot\|_{L^p}}\). Writing \(T_*f(x) = \sup_{k \in \mathbb{N}} |T_k f(x)|\), it is clear that \[\Omega(f)(x) \leq 2 T_*f(x).\] Thus, \[\mu(\{x\in X: \Omega(f)(x) > \alpha\|f\|_{L^p(X)} \}) \leq C(\alpha/2).\] For every \(\phi \in \mathcal L_X^p\), we have that \(\lim_{n \rightarrow \infty} T_n \phi(x)\) exists \(\mu\)-a.e. on \(X\). Thus, \(\Omega(\phi)(x) = 0\) \(\mu\)-a.e. on \(X\) and, by linearity of the \(T_n\), \(\Omega(f - \phi)(x) = \Omega(f)(x)\) \(\mu\)-a.e. on \(X\). Applying the previous bound to \(f-\phi\), it follows that \[\mu(\{x \in X: \Omega(f)(x) > \alpha\|f-\phi\|_{L^p(X)} \}) \leq C(\alpha/2).\] Now let \(f \in \overline{\mathcal L_X^p}^{\|\cdot\|_{L^p}}\) and, for a given \(\varepsilon > 0\), pick \(\phi \in \mathcal L_X^p\) so that \(\|f-\phi\|_{L^p} \leq \varepsilon^2\). Take \(\alpha = \varepsilon^{-1}\), so that \(\alpha\|f-\phi\|_{L^p} \leq \varepsilon\), and note that \[\mu(\{x \in X: \Omega(f)(x) > \varepsilon \}) \leq C((2\varepsilon)^{-1}).\] The left-hand side does not decrease as \(\varepsilon\) decreases, while the right-hand side tends to \(0\) as \(\varepsilon \rightarrow 0\), so \(\mu(\{x \in X: \Omega(f)(x) > \varepsilon \}) = 0\) for every \(\varepsilon > 0\). Therefore, \(\mu(\{x \in X: \Omega(f)(x) > 0 \}) = 0\). $$\tag*{$\blacksquare$}$$

Strategy of proving pointwise convergence

In view of the previous proposition, pointwise convergence problems for a sequence of operators \((T_n)_{n \in \mathbb{N}}\) are reduced to a two-step procedure:

  1. In the first step, we have to show that the maximal function \(T_*f(x) = \sup\limits_{k \in \mathbb{N}} |T_k f(x)|\) satisfies \[C(\alpha) = \sup\limits_{ f \in L^p(X)} \mu(\{x \in X: \sup\limits_{k \in \mathbb{N}} |T_k f(x)| > \alpha \|f\|_{L^p}\}) \xrightarrow[\alpha \rightarrow \infty]{} 0.\] In view of the previous proposition, this ensures that the set of \(L^p(X)\) functions for which we have pointwise convergence is closed in \(L^p(X)\).

  2. In the second step, we have to find a dense class \(\mathcal D\) of functions in \(L^p(X)\) for which we have pointwise convergence. In other words, if \(\overline{\mathcal D}^{\|\cdot\|_{L^p}} = L^p(X)\) and \(\mathcal D \subseteq \mathcal L_X^p = \overline{\mathcal L_X^p}^{\|\cdot\|_{L^p}},\) then \(\mathcal L_X^p = L^p(X)\).

Combining these two steps we conclude that the pointwise convergence for \((T_nf)_{n \in \mathbb{N}}\) holds for all \(f\in L^p(X)\), since \[\mathcal L_X^p = L^p(X).\]

Baire category theorem

Baire’s category theorem

Theorem.

Assume that \((X,\rho)\) is a complete metric space. If \((U_n)_{n\in\mathbb{N}}\) is a countable family of open dense sets in \(X\), then \(\bigcap_{n\in\mathbb{N}}U_n\) is dense in \(X\).

Proof. Let \(V\) be a nonempty open set in \(X\). We show that \[ V\cap\bigcap_{n\in\mathbb{N}} U_n\ne \varnothing.\]

  • Since \(U_1\) is open and dense, \(U_1 \cap V\) is open and nonempty. Choose an open ball \(B_1\) of diameter \(< 1\) such that \({\rm cl}(B_1) \subseteq U_1 \cap V\).

  • Since \(U_2\) is open and dense, by the same argument we get an open ball \(B_2\) of diameter \(< 1/2\) such that \({\rm cl}(B_2) \subseteq U_2 \cap B_1\).

  • Proceeding similarly, we define a sequence \((B_n)_{n\in\mathbb{N}}\) of open balls in \(X\) such that for each \(n\in\mathbb{N}\), we have

    • \({\rm diam}(B_n) < 2^{-n+1}\);

    • \({\rm cl}(B_1) \subseteq U_1 \cap V\);

    • \({\rm cl}(B_{n+1}) \subseteq U_{n+1} \cap B_n\).

Since \((X, \rho)\) is a complete metric space, by Cantor’s theorem \[\bigcap_{n\in\mathbb{N}} B_n = \bigcap_{n\in\mathbb{N}} {\rm cl}(B_n) =\{x\} \quad \text{ for some } \quad x\in X.\] Clearly, \[x \in V\cap\bigcap_{n\in\mathbb{N}} U_n\] showing that \(V\cap\bigcap_{n\in\mathbb{N}} U_n\ne \varnothing\) as desired. $$\tag*{$\blacksquare$}$$

Remark.

  • Baire’s theorem is a very important result in mathematics, which is often used in analysis to prove results that have existential nature.

  • The name of this theorem comes from Baire’s terminology for sets.

  • If \(X\) is a topological space, a set \(E \subseteq X\) is of the first category, according to Baire, if \(E\) is a countable union of nowhere dense sets; otherwise \(E\) is of the second category.

Dual form of Baire’s category theorem

Corollary.

Every completely metrizable space is of the second category in itself.

Proof. Let \(X\) be a completely metrizable space. Suppose \(X\) is of the first category in itself. Choose a sequence \((F_n)_{n\in\mathbb{N}}\) of closed and nowhere dense sets such that \(X=\bigcup_{n\in\mathbb{N}} F_n\). Then the sets \(U_n = F_n^c\) are dense and open, and \(\bigcap_{n\in\mathbb{N}} U_n=\varnothing\). This contradicts the Baire category theorem.$$\tag*{$\blacksquare$}$$

Theorem.

If \((F_n)_{n\in\mathbb{N}}\) is a countable family of nowhere dense sets in a complete metric space \((X,\rho)\), then \(\bigcup_{n\in\mathbb{N}}F_n\) has empty interior.

Proof. Since \({\rm int}({\rm cl}{(F_n)})=\varnothing\) for every \(n\in\mathbb{N}\) observe \[{\rm int}\Big(\bigcup_{n\in\mathbb{N}} F_n\Big)\subseteq {\rm int}\Big(\bigcup_{n\in\mathbb{N}} {\rm cl}(F_n)\Big) =\bigg({\rm cl}\Big(\bigcap_{n\in\mathbb{N}} {\rm cl}(F_n)^c\Big)\bigg)^c=X^c=\varnothing,\] by Baire’s category theorem, since each \({\rm cl}(F_n)^c\) is open and dense.$$\tag*{$\blacksquare$}$$

Banach continuity principle

Operators continuous in measure

Let \((X, \mathcal M,\mu)\) be a \(\sigma\)-finite measure space.

Definition.

Let \(T : L^p(X) \rightarrow L^p(X)\), with \(1 \leq p < \infty\), be a linear operator. We say that \(T\) is continuous in measure if for every sequence \((f_k)_{k \in \mathbb{N}}\subseteq L^p(X)\) with \(\|f_k\|_{L^p(X)} \xrightarrow[k\rightarrow \infty]{} 0\) and for every \(\varepsilon > 0\), we have \[\mu(\{x \in X: |T(f_k)(x)| > \varepsilon\}) \xrightarrow[k \rightarrow \infty]{} 0.\]

We say that a sequence of linear operators \((T_n)_{n \in \mathbb{N}}\) such that \[T_n : L^p(X) \rightarrow L^p(X) \text{ for some } 1 \leq p < \infty\] is continuous in measure, if each \(T_n\) is continuous in measure.

Remark

The assumption about continuity in measure is very mild and, in practice, can be easily verified. Indeed, in applications, we will usually work with a sequence of bounded linear operators \((T_n)_{n \in \mathbb{N}}\), which means that, for every \(n \in \mathbb{N}\), there is \(0 \leq C_n < \infty\) such that \[\|T_n f\|_{L^p(X)} \leq C_n \|f\|_{L^p(X)}\quad \text{ for all } \quad f \in L^p(X).\] Then, for every \(\varepsilon>0\), by Chebyshev’s inequality, we have \[\begin{aligned} \mu(\{x \in X: |T_n(f_k)(x)| > \varepsilon \}) &\leq \frac{1}{\varepsilon^p} \int_X |T_n(f_k)(x)|^p d\mu(x)\\ &= \frac{1}{\varepsilon^p} \|T_n(f_k)\|_{L^p}^p \leq \bigg(\frac{C_n}{\varepsilon}\bigg)^p \|f_k\|_{L^p}^p \xrightarrow[k \rightarrow \infty]{} 0. \end{aligned}\] Thus, \((T_n)_{n \in \mathbb{N}}\) is continuous in measure.

Remark

If \(T_n f(x)\) converges \(\mu\)-almost everywhere on \(X\), then the maximal function (maximal operator) \[T_*f(x) = \sup\limits_{n \in \mathbb{N}} |T_n f(x)|\] is finite \(\mu\)-almost everywhere on \(X\), since every convergent sequence of numbers is bounded.

Now we will prove a certain uniform boundedness principle that asserts that the continuity in measure of a sequence of operators and the almost everywhere finiteness of the maximal operator imply continuity at \(0\) in measure of the maximal operator.

Banach continuity principle

Theorem.

Let \((X,\mathcal M,\mu)\) be a finite measure space. Assume that:

  1. The sequence \((T_n)_{n \in \mathbb{N}}\) of linear operators \(T_n: L^p(X) \rightarrow L^p(X)\) for some \(1 \leq p < \infty\) is continuous in measure.

  2. The maximal operator \(T_*f(x) = \sup_{n \in \mathbb{N}} |T_n f(x)| < \infty\) \(\mu\)-a.e. on \(X\).

Then there exists a decreasing function \(C(\alpha)\) defined for all \(\alpha > 0\) and such that \(\lim_{\alpha \rightarrow \infty} C(\alpha) = 0\) and such that \[\mu(\{x \in X: T_*f(x) > \alpha \|f\|_{L^p(X)} \}) \leq C(\alpha) \quad \text{ for every } \quad f \in L^p(X).\]

Proof. The idea is to define \[C(\alpha) = \sup_{f \in L^p(X)} \mu(\{x \in X: T_*f(x) > \alpha \|f\|_{L^p(X)}\})\] and show that \(C(\alpha)\) is decreasing, which is clear, and \(\lim\limits_{\alpha \rightarrow \infty} C(\alpha) = 0\).

Given \(\varepsilon > 0\), consider \[F_n = \{f \in L^p(X): \mu(\{x \in X: T_*f(x) > n\}) \leq \varepsilon \}.\] Step 1: We show that \(F_n\) is closed in \(L^p(X)\) for every \(n \in \mathbb{N}\). To prove this, consider \(f \notin F_n\), so then \[\mu(\{x \in X: T_*f(x) > n\}) > \varepsilon.\] It follows that there exists \(N \in \mathbb{N}\) so that \[\mu(\{x \in X: \sup\limits_{1 \leq k \leq N} |T_k f(x)| > n \}) > \varepsilon.\] Then there exists \(\delta > 0\) such that \[\mu(\{x \in X : \sup\limits_{1 \leq k \leq N} |T_k f(x)| > n + \delta\}) > \varepsilon + \delta.\] These two choices of \(N\) and \(\delta\) are possible by the continuity of the measure \(\mu\).

Now, by the continuity in measure of the operators \(T_k\), there exists a \(\delta' > 0\) such that, for every \[g \in B(f,\delta') = \{h \in L^p(X): \|f-h\|_{L^p} < \delta'\},\] we have for \(1 \leq k \leq N\) that \[\mu(\{x \in X: |T_k(f-g)(x)| > \delta\}) < \frac{\delta}{2^k}.\] Let \(Z = \bigcup_{k=1}^N \{x \in X: |T_k(f-g)(x)| > \delta\}\). Then \(\mu(Z) \leq \delta\) and \[\{x \in X: \sup\limits_{1 \leq k \leq N} |T_k f(x)| > n+\delta\} \subseteq Z \cup \{x \in X: T_*g(x) > n\},\] which implies that \(\mu(\{x \in X: T_*g(x) > n\}) > \varepsilon\) for all \(g \in B(f,\delta')\). Therefore, \(L^p(X) \setminus F_n\) is open as desired.

Step 2: Since \(T_*f(x) = \sup\limits_{k \in \mathbb{N}} |T_k f(x)| < \infty\) \(\mu\)-a.e. and \(\mu(X) < \infty\), we have \[L^p(X) = \bigcup_{n \in \mathbb{N}} F_n,\] and we know the \(F_n\) are closed. Since \(L^p(X)\) is complete, by the Baire category theorem, there is at least one \(n \in \mathbb{N}\) with \[{\rm int}(F_n) \neq \varnothing.\] Thus, there exist \(f_0 \in F_n\) and \(\delta > 0\) so that \(f_0 + \delta g \in F_n\) for all \(g \in L^p(X)\) with \(\|g\|_{L^p} = 1\), and \[\mu(\{x \in X: T_*(f_0 + \delta g)(x) > n \}) \leq \varepsilon.\] The same bound holds with \(-g\) in place of \(g\). Since \(2\delta g = (f_0 + \delta g) - (f_0 - \delta g)\) and the \(T_k\) are linear, we have \(2\delta T_*g \leq T_*(f_0 + \delta g) + T_*(f_0 - \delta g)\). Therefore, \[\begin{aligned} \mu(\{x \in X: T_*g(x) \geq \frac{2n}{\delta}\}) &\leq \mu(\{x \in X: T_*(f_0 + \delta g)(x) > n \})\\ &+ \mu(\{x \in X: T_*(f_0 - \delta g)(x) > n \}) \leq 2\varepsilon. \end{aligned}\]

Hence, for every \(g \in L^p(X)\), \[\mu(\{x \in X: T_*g(x) > \frac{2n}{\delta} \|g\|_{L^p} \}) \leq 2\varepsilon.\] Setting \[C(\alpha) = \sup_{f \in L^p(X)} \mu(\{x \in X: T_*f(x) > \alpha \|f\|_{L^p(X)}\}),\] we have that \(\lim\limits_{\alpha \rightarrow \infty} C(\alpha) = 0\). $$\tag*{$\blacksquare$}$$

Hopf maximal inequality

Hopf maximal inequality

Let \((X,\mathcal M,\mu)\) be a measure space and consider a linear operator \(T: L^1(X) \rightarrow L^1(X)\) such that

  1. \(f \geq 0\) \(\mu\)-almost everywhere implies \(Tf \geq 0\) \(\mu\)-almost everywhere.

  2. \(\|Tf\|_{L^1(X)} \leq \|f\|_{L^1(X)}\) for all \(f \in L^1(X)\).

  3. \(\|Tf\|_{L^\infty(X)} \leq \|f\|_{L^\infty(X)}\) for all \(f \in L^1(X) \cap L^\infty(X)\).

Remark. If \(\mu(X) < \infty\), then (1) and \(T\mathbf{1}_{X} = \mathbf{1}_{X}\) imply (3).

Lemma.

Let \(1 \leq p < \infty\), \(0 \leq f \in L^p(X;\mathbb{R})\), and \(\lambda > 0\). Then the following hold:

  1. \(\mu(\{x \in X: f(x) > \lambda \}) \leq \lambda^{-p}\|f\|_{L^p}^p < \infty.\)

  2. \((f-\lambda)^+ \in L^1(X) \cap L^p(X)\).

  3. \(Tf - \lambda \leq T(f-\lambda)^+\).

Proof.

  1. This is just Chebyshev’s inequality.

  2. Let \(A=\{x\in X: f(x)>\lambda\}\), and note that \[(f - \lambda)^+ = (f-\lambda)\mathbf{1}_{{A}} = f\mathbf{1}_{{A}} - \lambda\mathbf{1}_{{A}}.\] Since \(\mu(A) < \infty\), the claim holds.

  3. Observe that \[\begin{aligned} |f - (f - \lambda)^+| = &|f \mathbf{1}_{{A}} - (f-\lambda)^+ \mathbf{1}_{{A}} + f\mathbf{1}_{{A^c}} - (f-\lambda)^+ \mathbf{1}_{{A^c}}|\\ &= |\lambda \mathbf{1}_{{A}} + f \mathbf{1}_{{A^c}}| \leq \lambda. \end{aligned}\]

    Therefore, \[Tf - T(f-\lambda)^+ \leq |T(f - (f-\lambda)^+)| \leq \|f-(f-\lambda)^+\|_{L^\infty(X)} \leq \lambda.\] This completes the proof of the Lemma.$$\tag*{$\blacksquare$}$$

Notation

For \(0 \leq f \in L^p(X)\) with \(1 \leq p < \infty\) and \(\lambda > 0\), we write \[\begin{gathered} A_*^n f = \max_{1 \leq k \leq n} A_k f,\\ A_k f = \frac{1}{k} \sum_{m = 0}^{k-1} T^m f,\\ S_k f = \sum_{m = 0}^{k-1} T^m f,\\ M_\lambda^n f = \max_{1 \leq k \leq n} (S_k f - k \lambda). \end{gathered}\] Then the set \[\begin{aligned} \qquad\qquad \{ x \in X: A_*^n f(x) > \lambda\} &= \{x \in X: M_\lambda^n f(x) > 0\}\\ &\subseteq \bigcup_{k=1}^n \{x \in X: S_k f(x) > k \lambda\} \end{aligned}\] has finite measure and \((M_\lambda^n f)^+ \in L^1(X) \cap L^p(X)\) by the previous lemma.

Hopf maximal inequality

Theorem.

Let \((X,\mathcal M,\mu)\) be a measure space and consider a linear operator \(T: L^1(X) \rightarrow L^1(X)\) such that

  1. \(f \geq 0\) \(\mu\)-almost everywhere implies \(Tf \geq 0\) \(\mu\)-almost everywhere.

  2. \(\|Tf\|_{L^1(X)} \leq \|f\|_{L^1(X)}\), \(f \in L^1(X)\).

  3. \(\|Tf\|_{L^\infty(X)} \leq \|f\|_{L^\infty(X)}\) for all \(f \in L^1(X) \cap L^\infty(X)\).

Let \(1 \leq p < \infty\) and \(0 \leq f \in L^p(X)\). Then, for each \(\lambda > 0\) and \(n \in \mathbb{N}\), we have \[\mu(\{x \in X: A_*^n f(x) > \lambda \}) \leq \frac{1}{\lambda} \int_{\{A_*^n f > \lambda \}} f d\mu.\]

Proof. Take \(k \in \{2,3,\ldots,n\}\) and note that \[\begin{aligned} S_k f - k\lambda &= f-\lambda + TS_{k-1}f - (k-1)\lambda\\ &\leq f - \lambda + T(S_{k-1}f - (k-1)\lambda)^+ \\ &\leq f - \lambda + T(M_\lambda^n f)^+, \end{aligned}\] (for \(k = 1\), \(S_kf - k\lambda = f - \lambda\)). Taking the maximum over \(k\), we obtain \[M_\lambda^n f \leq f - \lambda + T(M_\lambda^n f)^+.\] Now we see that \[\begin{aligned} \int_X (M_\lambda^n f)^+ d\mu &= \int_{\{M_\lambda^n f > 0 \}} M_\lambda^n f d\mu\\ &\leq \int_{\{M_\lambda^n f > 0 \}} (f - \lambda) d\mu + \int_X T(M_\lambda^n f)^+ d\mu \\ &\leq \int_{\{M_\lambda^n f > 0 \}} (f - \lambda) d\mu + \int_X (M_\lambda^n f)^+ d\mu. \end{aligned}\]

After subtracting from both sides, we have \[0 \leq \int_{\{M_\lambda^n f > 0 \}} (f - \lambda) d\mu,\] and this means that \[\lambda \mu(\{x \in X: M_\lambda^n f(x) > 0 \}) \leq \int_{\{M_\lambda^n f > 0 \}} f d\mu,\] and \[\mu(\{x \in X: M_\lambda^n f(x) > 0 \}) \leq \frac{1}{\lambda}\int_{\{M_\lambda^n f > 0 \}} f d\mu,\] and \[\mu(\{x \in X: A_*^n f(x) > \lambda \}) \leq \frac{1}{\lambda} \int_{\{A_*^n f > \lambda \}} f d\mu.\] This completes the proof. $$\tag*{$\blacksquare$}$$
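The inequality can be tested on a toy model. The sketch below assumes \(X=\mathbb Z/L\mathbb Z\) with counting measure and \(Tf(x)=f(x+1 \bmod L)\), which satisfies assumptions 1 to 3; the function name `hopf_sides` and the sample data are hypothetical choices made for illustration. Exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
import random


def hopf_sides(f, n, lam):
    """Return both sides of Hopf's inequality for the cyclic shift.

    f is a list of non-negative integers on Z/LZ (counting measure),
    T f(x) = f(x + 1 mod L), and lam > 0. The result is the pair
    (mu{A_*^n f > lam}, (1/lam) * sum of f over that set).
    """
    L = len(f)
    measure = 0
    integral = 0
    for x in range(L):
        partial = 0          # S_k f(x) = f(x) + f(x+1) + ... + f(x+k-1)
        best = Fraction(0)   # running maximum of A_k f(x); valid since f >= 0
        for k in range(1, n + 1):
            partial += f[(x + k - 1) % L]
            best = max(best, Fraction(partial, k))
        if best > lam:
            measure += 1
            integral += f[x]
    return measure, Fraction(integral) / lam


random.seed(0)
f = [random.randint(0, 9) for _ in range(40)]
for lam in (2, 4, 6):
    print(lam, hopf_sides(f, 8, lam))
```

In each printed pair the first entry should not exceed the second, as the theorem predicts.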

Hopf maximal inequality for complex-valued functions

Remark. We can extend the result to the case where \(f \in L^p(X)\) is a complex-valued function and \(T\) is positive, i.e., \(|Tf| \leq T|f|\). Indeed, \(|Tf| = \alpha Tf\) for some \(\alpha \in \mathbb{C}\) with \(|\alpha| = 1\). Thus, \(|Tf| = {\rm Re}(\alpha Tf) = {\rm Re}(T(\alpha f))\). Moreover, \(Tf\) is real-valued for every real-valued \(f\), since \(T\) is positive and \(f = f^+ - f^-\). Hence \[|Tf| = {\rm Re}(T(\alpha f)) = {\rm Re}\big(T({\rm Re}(\alpha f)) + i T({\rm Im}(\alpha f))\big) = T({\rm Re}(\alpha f)).\] Therefore, \[|Tf| = T({\rm Re}(\alpha f)) \leq T|{\rm Re}(\alpha f)| \leq T|f|\] since \(|{\rm Re}(\alpha f)| \leq |\alpha f| = |f|\). Thus, \[\mu(\{ x \in X: A_*^n |f|(x) > \lambda \}) \leq \frac{1}{\lambda} \int_{\{A_*^n|f| > \lambda \}} |f| d\mu.\]

Hopf maximal inequality for \(L^p\) spaces

Corollary.

Under the assumptions of the previous theorem, we have for all \(f \in L^p(X)\) with \(1 < p < \infty\) that \[\|A_*^n f \|_{L^p(X)} \leq \frac{p}{p-1} \|f\|_{L^p(X)}.\]

Proof. First note that \(\|A_*^n f \|_{L^p(X)} \leq \|A_*^n |f| \|_{L^p(X)}\). By the layer-cake formula, Hopf’s maximal inequality, Fubini’s theorem, and Hölder’s inequality, \[\begin{aligned} \|A_*^n |f| \|_{L^p(X)}^p &= p \int_0^\infty \lambda^{p-1} \mu(\{x \in X: A_*^n |f|(x) > \lambda \}) d\lambda\\ &\leq p \int_0^\infty \lambda^{p-2} \int_{\{A_*^n |f| > \lambda \}} |f| d\mu d\lambda \\ &= p \int_X |f| \bigg( \int_0^{A_*^n |f|} \lambda^{p-2} d\lambda \bigg) d\mu\\ &= \frac{p}{p-1} \int_X |f| (A_*^n |f|)^{p-1} d\mu \leq \frac{p}{p-1} \|f\|_{L^p} \|A_*^n |f| \|_{L^p}^{p-1}. \end{aligned}\] Dividing both sides by \(\|A_*^n |f| \|_{L^p}^{p-1}\), when this quantity is finite and nonzero, gives the claim. $$\tag*{$\blacksquare$}$$

Pointwise ergodic theorem

Birkhoff’s ergodic theorem

Theorem.

Let \((X,\mathcal M,\mu, T)\) be a \(\sigma\)-finite measure-preserving system. For \(1\le p\le \infty\) and \(f\in L^p(X)\) define the ergodic average by \[A_Nf(x)=\frac{1}{N}\sum_{m=0}^{N-1}f(T^mx)\quad \text{ for } \quad x\in X.\] If \(1\le p< \infty\) and \(f\in L^p(X)\) then there exists a \(T\)-invariant function \(f^*\in L^p(X)\) such that \[\lim_{N\to \infty}A_Nf(x)=f^*(x) \quad \text{$\mu$-a.e. on $X$.}\] If additionally, \(0<\mu(X)<\infty\) and \(T\) is ergodic, then \[\lim_{N\to \infty}A_Nf(x)=\frac{1}{\mu(X)}\int_Xf(x)d\mu(x) \quad \text{$\mu$-a.e. on $X$.}\]

Proof. We shall use the two-step procedure to establish pointwise convergence. Let \[A_*f(x)=\sup_{N\in\mathbb N}|A_Nf(x)|\quad \text{ for } \quad x\in X\] be the maximal function corresponding to the ergodic averages \(A_Nf\).

Step 1. By Hopf’s maximal inequality we may derive the following two maximal estimates for \(A_*f\) for any \(f\in L^p(X)\) with \(1\le p\le \infty\), i.e.

  • If \(p=1\) we have weak-type \((1,1)\) maximal inequality \[\mu(\{x\in X: A_*f(x)>\lambda\})\le \frac{1}{\lambda}\|f\|_{L^1(X)}\quad \text{ for } \quad \lambda>0.\]

  • If \(1<p \le \infty\) we have strong-type \((p,p)\) maximal inequality \[\|A_*f\|_{L^p(X)}\le \frac{p}{p-1}\|f\|_{L^p(X)}.\]

This ensures that the set of \(L^p(X)\) functions for which we have pointwise convergence for \(A_Nf\) is closed in \(L^p(X)\).

Step 2. In view of the previous step we have to find a dense class of functions \(\mathcal D\subseteq L^p(X)\) such that \(A_Nf\) converges \(\mu\)-a.e. on \(X\) for \(f\in \mathcal D\). We now distinguish two cases.

  • Suppose first that \(p=2\); here we use Riesz’s decomposition. Let \[\begin{gathered} I_T=\{f\in L^2(X): f\circ T=f\},\\ J_T=\{g\circ T-g: g\in L^2(X)\cap L^{\infty}(X)\}. \end{gathered}\] Then one sees that \[\overline{I_T\oplus J_T}^{\|\cdot\|_{L^2(X)}}=L^2(X),\] which means that \(\mathcal D=I_T\oplus J_T\) is dense in \(L^2(X)\). If \(f\in I_T\), then clearly \(A_Nf=f\) \(\mu\)-a.e. on \(X\), and we have \[\lim_{N\to \infty}A_Nf(x)=f(x)\quad \text{$\mu$-a.e. on $X$.}\]

  • If \(f\in J_T\), then \(f=g\circ T-g\) for some \(g\in L^2(X)\cap L^{\infty}(X)\), and by telescoping we deduce that \[\begin{aligned} |A_Nf(x)| &=\Big|\frac{1}{N}\sum_{m=0}^{N-1}(g(T^{m+1}x)-g(T^mx))\Big|\\ &=\frac{1}{N}|g(T^Nx)-g(x)|\le\frac{2}{N}\|g\|_{L^{\infty}(X)}\xrightarrow[N \rightarrow \infty]{} 0, \end{aligned}\] \(\mu\)-a.e. on \(X\). Thus we have established pointwise almost everywhere convergence of \(A_Nf\) on a dense class \(\mathcal D=I_T\oplus J_T\) in \(L^2(X)\), which combined with the maximal estimate for \(p=2\) gives pointwise almost everywhere convergence for all \(f\in L^2(X)\).

  • Suppose that \(p\neq2\). Then it suffices to observe that \(L^2(X)\cap L^p(X)\) is dense in \(L^p(X)\) and \(A_Nf\) converges pointwise almost everywhere for any \(f\in L^2(X)\cap L^p(X)\). Now combining this fact with the maximal estimates for all \(1\le p<\infty\) we deduce pointwise almost everywhere convergence of \(A_Nf\) for all \(f\in L^p(X)\) as desired.

  • We now show that the limit \[\lim_{N\to \infty}A_Nf(x)=f^*(x)\] is \(T\)-invariant. Indeed, note that \[\begin{aligned} f^*(Tx)=&\lim_{N\to \infty}A_Nf(Tx)\\ &=\lim_{N\to \infty}\bigg(\frac{N+1}{N}A_{N+1}f(x)-\frac{1}{N}f(x)\bigg)=f^*(x) \end{aligned}\] \(\mu\)-a.e. on \(X\), which shows that \(f^*\) is \(T\)-invariant.

  • Suppose now that \(0<\mu(X)<\infty\) and \(T\) is ergodic. We may assume, without loss of generality, that \(\mu(X)=1\).

  • Since \(f^*(Tx)=f^*(x)\) \(\mu\)-a.e. on \(X\), and \(T\) is ergodic, we deduce that \(f^*\) is constant \(\mu\)-a.e. on \(X\).

  • We now prove that \[f^*(x)=\int_Xf(x)d\mu(x) \quad \text{$\mu$-a.e. on $X$.}\] Indeed, if \(1<p \le \infty\) we have the strong-type \((p,p)\) maximal inequality \[\|A_*f\|_{L^p(X)}\le \frac{p}{p-1}\|f\|_{L^p(X)},\] so \(A_*f\in L^p(X)\subseteq L^1(X)\) because \(\mu(X)=1\). Since \(|A_Nf|\le A_*f\), we can use the dominated convergence theorem (DCT) to deduce that \[f^*(x)=\int_Xf^*(x)d\mu(x)=\lim_{N\to\infty}\int_XA_Nf(x)d\mu(x)=\int_Xf(x)d\mu(x)\] \(\mu\)-a.e. on \(X\), since \(f^*\) is constant \(\mu\)-a.e., and \(\mu\) is \(T\)-invariant.

  • For \(p=1\) we have to be more careful.

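  • Let \(\varepsilon>0\) and split the function \(f=g+h\), where \(g\in L^2(X)\) and \(h\in L^1(X)\) with \[\|h\|_{L^1(X)}<\varepsilon.\] By the previous case applied to \(g\), we have \(g^*(x)=\int_Xg(t)\mu(dt)\) \(\mu\)-a.e. on \(X\), and by linearity \(f^*=g^*+h^*\). Then \[\begin{aligned} \bigg|f^*(x)-\int_Xf(t)\mu(dt)\bigg|&=\bigg|h^*(x)-\int_Xh(t)\mu(dt)\bigg|\\ &\le2\|h\|_{L^1(X)}<2\varepsilon, \end{aligned}\] where the inequality uses \(\|h^*\|_{L^1(X)}\le\|h\|_{L^1(X)}\) (by Fatou’s lemma) and the fact that \(h^*\) is constant \(\mu\)-a.e. on \(X\). Since \(\varepsilon>0\) was arbitrary, this completes the proof of Birkhoff’s theorem. $$\tag*{$\blacksquare$}$$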
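To make the ergodic case concrete, the averages \(A_Nf\) can be computed for the circle rotation system from the first section. The sketch below assumes \(\alpha=\sqrt2\), the starting point \(x=0.1\), and \(f=\mathbf 1_{[0,0.3)}\); these are arbitrary illustrative choices, and floating-point arithmetic only approximates the true orbit.

```python
import math


def ergodic_average(f, x, alpha, N):
    """Compute A_N f(x) = (1/N) * sum_{m<N} f(R_alpha^m x) for the rotation R_alpha."""
    total = 0.0
    for m in range(N):
        total += f((x + m * alpha) % 1.0)
    return total / N


def indicator(t):
    """Indicator function of the interval [0, 0.3)."""
    return 1.0 if t < 0.3 else 0.0


alpha = math.sqrt(2)  # assumed irrational rotation number
for N in (10, 1000, 100000):
    print(N, ergodic_average(indicator, 0.1, alpha, N))
```

As \(N\) grows, the printed averages should approach \(\int_0^1 f\,dx=0.3\), the space average predicted by the theorem.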

Remark. If \(T:X\to X\) is ergodic on an infinite \(\sigma\)-finite measure space \((X, \mathcal M, \mu)\), then for every \(f\in L^p(X)\) with \(1\le p<\infty\) we have \[\lim_{N\to \infty}A_Nf(x)=0 \quad \text{$\mu$-a.e. on $X$.}\]

Equidistribution and Weyl’s criterion

A sequence \((a_k)_{k\in\mathbb{N}}\subseteq[0, 1]\) is called equidistributed if for every continuous function \(f:[0, 1]\to \mathbb C\) we have that \[\begin{aligned} \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f (a_k) = \int_{0}^1 f(x)d x. \end{aligned}\]

Theorem.

The following statements are equivalent:

  • The sequence \((a_k)_{k\in\mathbb{N}}\subseteq[0, 1]\) is equidistributed.

  • For every \(m \in \mathbb{Z}\setminus \{0\}\) we have \[\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} e^{2 \pi i m a_k}= 0.\]

  • For any \([a, b]\subset [0, 1)\) we have

    \[\lim_{N \to \infty} \frac{\#\{1\le n\le N \colon a_n \in [a, b] \}}{N} = b-a.\]
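The second condition is easy to probe numerically. The sketch below assumes the sequence \(a_k=\{k\sqrt2\}\) as an equidistributed example and a constant sequence as a counterexample; the helper name `weyl_sum` is illustrative only.

```python
import cmath
import math


def weyl_sum(seq, m):
    """Return (1/N) * sum_k exp(2*pi*i*m*a_k) for the finite sequence seq."""
    return sum(cmath.exp(2j * math.pi * m * a) for a in seq) / len(seq)


theta = math.sqrt(2)  # assumed irrational
rotation = [(k * theta) % 1.0 for k in range(1, 10001)]
constant = [0.25] * 10000  # not equidistributed

for m in (1, 2, 3):
    print(m, abs(weyl_sum(rotation, m)), abs(weyl_sum(constant, m)))
```

For the rotation sequence the moduli should be close to \(0\), whereas for the constant sequence they equal \(1\) up to rounding, so Weyl's criterion fails there.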

Classical equidistribution problems

  • Around 1910, Weyl (also independently Bohl and Sierpiński) proved that for every irrational \(\theta \in {\mathbb R}\) and any \([a, b]\subset[0, 1)\) one has \[\begin{aligned} \lim_{N\to\infty} \frac{\# \{ 1\le n\le N: \{\theta n\} \in [a, b] \}}{N} = b-a. \end{aligned}\]

  • In 1916, Weyl showed that \((\{P(n)\})_{n\in\mathbb{N}}\) is equidistributed for every polynomial \(P:\mathbb{R}\to \mathbb{R}\) having at least one irrational coefficient.

  • In 1933 Khinchin had the great insight to see how to generalize the classical equidistribution result by using Birkhoff’s ergodic theorem and proved that for any irrational \(\theta \in {\mathbb R}\), for any Lebesgue measurable set \(E\subseteq [0,1)\), and for almost every \(x\in {\mathbb R}\), one has \[\begin{aligned} \lim_{N\to\infty} \frac{\# \{ 1\le n\le N: \{x + \theta n\} \in E \}}{N} = |E|. \end{aligned}\]

Consequences of the Birkhoff ergodic theorem

  • (Borel’s Theorem on Normal Numbers). Almost all numbers in \([0,1)\) are normal to base \(2\), i.e. for a.e. \(x\in [0,1)\) the frequency of \(1\)’s in the binary expansion of \(x\) is \(1/2\).

  • (Frequency of the natural number \(k\) in the partial quotients). For almost every real number \(x \in (0, 1)\), the digit \(k\) appears in the continued fraction expansion \(x = [a_1, a_2, \ldots]\) with density \[(\log 2)^{-1}\log\bigg(1+\frac{1}{k^2+2k}\bigg).\]

  • (Strong law of large numbers). If \(X_1, X_2,\ldots\) is an infinite sequence of i.i.d. integrable random variables with mean \(\mu\), then \[\lim_{N\to\infty}\frac{1}{N}(X_1+\ldots+X_N)=\mu \quad \text{almost surely.}\]

  • (Kac theorem). Let \((X, \mathcal{M}, \mu, T)\) be a probability measure-preserving system, i.e. \(\mu(X)=1\), and assume that \(T\) is ergodic. Then for any \(A\in \mathcal M\) with \(\mu(A)>0\) the expected return time to \(A\) is \(\mu(A)^{-1}\); equivalently, \(\int_{A}\inf\{n\in\mathbb{N}: T^n(x)\in A\}d\mu(x)=1.\)
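The partial-quotient frequencies can be explored experimentally. The sketch below is only a heuristic illustration: it assumes that a random rational with a very large denominator behaves like a typical real number, computes its finite continued fraction by Euclid's algorithm, and compares digit frequencies with the density stated above. The names `partial_quotients` and `digit_density` are hypothetical.

```python
import math
import random


def partial_quotients(num, den):
    """Continued fraction digits [a_1, a_2, ...] of num/den, with 0 < num < den."""
    digits = []
    a, b = num, den
    while a:
        q, r = divmod(b, a)  # a_1 = floor(1/x) and T(x) = r/a for x = a/b
        digits.append(q)
        a, b = r, a
    return digits


def digit_density(k):
    """Limiting frequency of the digit k: log(1 + 1/(k^2 + 2k)) / log 2."""
    return math.log2(1 + 1 / (k * k + 2 * k))


random.seed(0)
bits = 20000  # precision of the random rational; an arbitrary choice
digits = partial_quotients(random.getrandbits(bits) | 1, 1 << bits)
for k in (1, 2, 3):
    print(k, digits.count(k) / len(digits), digit_density(k))
```

The empirical frequencies in the second column are expected to be near the values in the third column, although a finite expansion can only approximate the almost-everywhere statement.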
