
\documentclass{article}
\usepackage{amsmath,amssymb}
\setlength{\oddsidemargin}{0.25 in}
\setlength{\evensidemargin}{-0.25 in}
\setlength{\topmargin}{-0.6 in}
\setlength{\textwidth}{6.5 in}
\setlength{\textheight}{8.5 in}
\setlength{\headsep}{0.75 in}
\setlength{\parindent}{0 in}
\setlength{\parskip}{0.1 in}
\newcounter{lecnum}
%
% The following macro is used to generate the header.
%
\newcommand{\lecture}[4]{
  \pagestyle{myheadings}
  \thispagestyle{plain}
  \newpage
  \setcounter{lecnum}{#1}
  \setcounter{page}{1}
  \noindent
  \begin{center}
    \framebox{
      \vbox{\vspace{2mm}
        \hbox to 6.28in { {{\bf 18.317~Combinatorics, Probability, and
  Computations on Groups} \hfill #2} }
        \vspace{4mm}
        \hbox to 6.28in { {\Large \hfill Lecture #1  \hfill} }
        \vspace{2mm}
        \hbox to 6.28in { {\it Lecturer: #3 \hfill Scribe: #4} }
        \vspace{2mm}}}
  \end{center}
  \markboth{Lecture #1: #2}{Lecture #1: #2}
  \vspace*{4mm}
}

\input{epsf}
%usage: \fig{LABEL}{FIGURE-HEIGHT}{CAPTION}{FILENAME}
\newcommand{\fig}[4]{
        \begin{figure}
        \setlength{\epsfysize}{#2}
        \centerline{\epsfbox{#4}}
        \caption{#3} \label{#1}
        \end{figure}
        }

\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{definition}[theorem]{Definition}
\newenvironment{proof}{{\bf Proof:}}{\hfill\rule{2mm}{2mm}}
\newenvironment{mproof}[1]{{\bf #1}}{\hfill\rule{2mm}{2mm}}

\def\E#1{{\rm E}[#1]}
\def\i{\hspace*{5mm}}
\newcounter{lineno}
\setcounter{lineno}{0}
\def\n{\addtocounter{lineno}{1} \hbox to 7mm{\hfill \arabic{lineno}:}\hspace{2mm}}
\def\resetn{\setcounter{lineno}{0}}
\def\ep{\epsilon}
\def\erd{Erd\H{o}s}

\begin{document}
\lecture{33}{5 December 2001}{Igor Pak}{Christopher Malon}

\section*{Blind Algorithms and Product Replacement}

Recall the Product Replacement Algorithm:
\begin{itemize}
\item Start at a generating $k$-tuple $(g_1, \ldots, g_k)$, {\em i.e.}, one
with $\left< g_1, \ldots, g_k \right> = G$.
\item Run a random walk on $\Gamma_k (G)$ for $T$ steps.
\item Output a random component $g_i$ of the vertex you arrive at.
\end{itemize}
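As a concrete illustration (not from the lecture), the three steps above can be
sketched in Python; the group is passed in abstractly through its operation
\texttt{mul} and inversion \texttt{inv}, and the toy example on ${\mathbb{Z}}_{101}$
is our own assumption:

```python
import random

def product_replacement(gens, mul, inv, T, rng=random):
    """Random walk on the product replacement graph Gamma_k(G).

    gens: a generating k-tuple of group elements
    mul, inv: the group operation and inversion
    T: number of random-walk steps
    Returns one uniformly chosen component of the final tuple.
    """
    tup = list(gens)
    k = len(tup)
    for _ in range(T):
        i, j = rng.sample(range(k), 2)          # distinct positions i != j
        s = tup[j] if rng.random() < 0.5 else inv(tup[j])
        tup[i] = mul(tup[i], s)                 # the move R_{ij}^{+-}
    return rng.choice(tup)

# Toy run on the cyclic group Z_101, written additively.
n = 101
g = product_replacement([1, 0, 0], mul=lambda a, b: (a + b) % n,
                        inv=lambda a: (-a) % n, T=500)
assert 0 <= g < n
```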

So that we know how long to take the random walk in this algorithm,
it would be helpful to know whether the mixing time of $\Gamma_k (G)$ is
polynomial in $\log \vert G \vert$.

We can make some trivial observations in response to this question:
\begin{itemize}
\item $\Gamma_k (G)$ need not even be connected, so the mixing time
could be infinite.
\item If $k > d(G) + m(G)$, where $d(G)$ is the minimal number of generators
of $G$ and $m(G)$ is the maximal size of a minimal generating set, then
$\Gamma_k (G)$ is connected, and its mixing time is finite.
\item The diameter of $\Gamma_k (G)$ is $O({\log}^2 \vert G \vert)$ for
$k = 2 \log \vert G \vert$.
The mixing time must be at least as big as the diameter,
but we don't know how much bigger.
\end{itemize}
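The first two observations can be checked by brute force on small cyclic groups
(our own illustration; note that every move
$(g_1, \ldots, g_k) \mapsto (g_1, \ldots, g_i g_j^{\pm 1}, \ldots, g_k)$
needs two distinct positions, so $\Gamma_1 (G)$ has no edges at all):

```python
from itertools import product
from math import gcd

def vertices(n, k):
    # Generating k-tuples of Z_n: tuples with gcd(n, g_1, ..., g_k) = 1.
    return {t for t in product(range(n), repeat=k) if gcd(n, *t) == 1}

def neighbors(t, n):
    # All moves (g_1,...,g_k) -> (..., g_i + s*g_j, ...) with i != j.
    for i in range(len(t)):
        for j in range(len(t)):
            if i != j:
                for s in (+1, -1):
                    u = list(t)
                    u[i] = (u[i] + s * t[j]) % n
                    yield tuple(u)

def num_components(n, k):
    todo, comps = vertices(n, k), 0
    while todo:                       # depth-first search on each component
        comps += 1
        stack = [todo.pop()]
        while stack:
            for u in neighbors(stack.pop(), n):
                if u in todo:
                    todo.remove(u)
                    stack.append(u)
    return comps

assert num_components(5, 1) == 4   # Gamma_1(Z_5): four isolated vertices
assert num_components(5, 2) == 1   # Gamma_2(Z_5) is connected
```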

We will prove:

\begin{theorem}
Given $c, c^{\prime} > 0$, there is a constant $c^{\prime \prime} > 0$
so that if
$c \log \vert G \vert \log \log \vert G \vert \leq k
\leq c^{\prime} \log \vert G \vert \log \log \vert G \vert$, then
the mixing time satisfies $\tau_4 \leq c^{\prime \prime} \, {\log}^{14} \vert G \vert
\, {(\log \log \vert G \vert)}^5$.
\label{thm:main}
\end{theorem}

\subsection*{Blind Algorithms}

Suppose $R_1, \ldots, R_k$ are reversible Markov chains
on $\{1, \ldots, n\}$, and let $\pi$ be a common stationary distribution,
{\em i.e.}, $R_i \pi = \pi$ for all $i$.  (If $\pi$ is the uniform
distribution, then reversibility means that the $R_i$ are symmetric
matrices.)
Define $M = \frac{1}{k} (R_1 + \ldots + R_k)$, which is again a Markov
chain satisfying $M \pi = \pi$.

Let $\overline{a} = (a_1, \ldots)$ be a finite sequence with each
$a_i \in \{ 1, \ldots, k \}$.  Let $l(\overline{a})$ denote the
length of the sequence $\overline{a}$.
Let $\cal{A}$ be the set of all such sequences $\overline{a}$,
and $A$ be a probability distribution on $\cal{A}$.
Let $T = E_A(l(\overline{a}))$ be the expected length.
For each $\overline{a}$,
define $R_{\overline{a}} = R_{a_1} \cdots R_{a_{l(\overline{a})}}$.

\begin{definition}
$A$ defines a {\em blind algorithm} if, for all $i \in \{ 1, \ldots, n \}$,
we have $\| E_A (R_{\overline{a}} (i)) - \pi \| < \frac{1}{4}$, where
$R_{\overline{a}} (i)$ denotes the distribution obtained by starting at $i$
and applying $R_{\overline{a}}$.
\end{definition}
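A tiny numerical sanity check of this definition (our own illustration; the two
chains, the choice of length-$4$ sequences, and the use of total variation
distance for $\| \cdot \|$ are all assumptions):

```python
import itertools
import numpy as np

# Two symmetric chains on {0, 1, 2}; the uniform pi is stationary for both.
R = [np.array([[0.5, 0.5, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.5, 0.5]]),
     np.array([[0.0, 0.5, 0.5],
               [0.5, 0.5, 0.0],
               [0.5, 0.0, 0.5]])]
pi = np.full(3, 1 / 3)

# A: uniform over all sequences (a_1, ..., a_4) in {0,1}^4, so k=2, l=4.
seqs = list(itertools.product(range(2), repeat=4))

def R_bar(a):
    M = np.eye(3)
    for ai in a:
        M = M @ R[ai]              # R_a = R_{a_1} ... R_{a_l}
    return M

EA = sum(R_bar(a) for a in seqs) / len(seqs)   # E_A(R_a)
for i in range(3):
    # Row i is E_A(R_a(i)): the averaged distribution started from i.
    assert 0.5 * np.abs(EA[i] - pi).sum() < 0.25
```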

A special case of a blind algorithm arises when we have a
labeled graph on $n$ vertices, the transition probabilities
in each $R_i$ are positive only between vertices that are joined
by an edge, and the $R_i$ are symmetric (so that the uniform distribution
is stationary with respect to all $R_i$).
If we fix a starting vertex $i$, each sequence $\overline{a}$
defines a probability distribution on the vertices of the
graph, namely, the probability distribution over the endpoints of paths
of length $l(\overline{a})$ from $i$, in which we use $R_{a_j}$
to decide where to go on the $j$th step.
If we, furthermore, impose a probability distribution $A$ on
the sequences $\overline{a}$, then we get a probability distribution
$Q_i$ on all the vertices of the graph.
To say that $A$ defines a ``blind'' algorithm means that for all $i$,
the separation distance $\| Q_i - U \| < \frac{1}{4}$.

Recall the fourth definition of mixing time for a random walk whose
probability distribution is $Q^t$ at the $t$th step (Lecture 12, October 5):
\begin{equation*}
\tau_4 = \min \{ t : \| Q^t - U \| < \frac{1}{4} \}
\end{equation*}

Note that neither $A$ nor $\overline{a}$ defines a random walk in the
usual sense, because the transition probabilities at each step
depend on more than our location in the graph.
However, $M = \frac{1}{k} (R_1 + \ldots + R_k)$ does define a random walk,
and we have the following theorem.

\begin{theorem}
Let $M = \frac{1}{k} (R_1 + \ldots + R_k)$.  If $A$ defines a blind
algorithm and $T$ is the expected length of a path chosen from $\cal{A}$
via $A$, then the mixing time $\tau_4 (M) = O(T^2 k^2 \log \frac{1}{\pi_0})$,
where $\pi_0$ is the minimum of the entries appearing in the stationary
distribution $\pi$.
\label{thm:blind}
\end{theorem}

We won't prove this theorem, but we'll apply it in a special case.

Suppose $G$ is a finite group, $S = S^{-1} = \{ s_1, \ldots, s_k \}$ is
a symmetric generating set, and $\Gamma = \Gamma (G, S)$ is the
corresponding Cayley graph.  Take each $R_i$ to be the permutation matrix
given by right multiplication $g \rightarrow g s_i$ (a deterministic
Markov chain).  Given any sequence $\overline{a} = (a_1, \ldots, a_l)$,
$R_{\overline{a}}$ sends $g \rightarrow g s_{a_1} \cdots s_{a_l}$.
For every element $g \in G$, fix a path from the identity $e$ to $g$
of minimal length.  Define a probability distribution
$A$ on $\cal{A}$ to be $\frac{1}{\vert G \vert}$ at
$\overline{a}$ if
$s_{a_1}, s_{a_1} s_{a_2}, \ldots, s_{a_1} s_{a_2} \cdots s_{a_l}$ is the
selected path from $e$ to the group element
$s_{a_1} \cdots s_{a_l}$, and zero otherwise.

In $G = {\mathbb{Z}}_n$ with $S = \{ \pm 1 \}$, there is essentially only one
way to fix these paths: the shortest decomposition of each element is unique,
except possibly for $\frac{n}{2}$ when $n$ is even.  The matrix $R_1$ corresponds
to moving left around the cycle, and $R_2$ to moving right.
The expected length $T$ of a path is $O(n)$, and $\pi_0 = \frac{1}{n}$ because
the uniform distribution is stationary under $R_1$ and $R_2$.  By Theorem
\ref{thm:blind}, the mixing time for this Cayley graph is $O(n^2 \log n)$.
This result is close to what we know ($O(n^2)$).
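The path distribution $A$ for this example can be checked exactly (a small
illustration of our own, using $n = 9$); because the endpoints of the selected
paths run over all of ${\mathbb{Z}}_n$, the averaged distribution from any
starting vertex is exactly uniform, so $A$ is blind:

```python
from fractions import Fraction

n = 9
# Fix a shortest word from 0 to g: g steps of +1 if g <= n//2,
# otherwise n-g steps of -1 (for odd n this choice is unique).
words = [[+1] * g if g <= n // 2 else [-1] * (n - g) for g in range(n)]

# A puts probability 1/|G| = 1/n on each selected word.
for start in range(n):
    dist = [Fraction(0)] * n
    for w in words:
        end = start
        for s in w:
            end = (end + s) % n
        dist[end] += Fraction(1, n)
    # Starting anywhere, the endpoint is exactly uniform.
    assert all(p == Fraction(1, n) for p in dist)

T = Fraction(sum(len(w) for w in words), n)  # expected path length, O(n)
assert T < n
```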

For any finite group $G$ with $A$ defined as above, we have
$T = E_A (l(\overline{a})) \leq d$, where $d = \mathrm{diam} (\Gamma (G, S))$.
Thus, for fixed $k$, the mixing time of the random walk on $\Gamma$ in which we
apply generators $s \in S$ uniformly at random is $O(d^2 \log \vert G \vert)$.

\subsection*{A Blind Algorithm on the Product Replacement Graph}

Finally, we sketch the proof of Theorem \ref{thm:main}.
Recall that the edges in the graph $\Gamma_k (G)$
are given by $$R_{i j}^{\pm} : (g_1, \ldots, g_k) \rightarrow
(g_1, \ldots, g_i g_j^{\pm 1}, \ldots, g_k)$$ where $g_i g_j^{\pm 1}$
appears in the $i$th position.  There are $O({\vert G \vert}^k)$ vertices in
the graph, and $O(k^2)$ edges emanate from every vertex.
Theorem \ref{thm:blind} will give us a bound on the mixing time
of $\Gamma_k (G)$ if we construct a blind algorithm $A$
with respect to the $R_{i j}^{\pm}$.

Let $(g_1, \ldots, g_r)$ be a generating $r$--tuple for $G$,
where $r = O(\log \vert G \vert)$,
and consider $(g_1, \ldots, g_r, 1, \ldots, 1) \in \Gamma_k (G)$.
Instead of following a random walk on the product replacement
graph $\Gamma_k (G)$, we're going to embed Babai's algorithm for
generating uniform random group elements into an algorithm
on $\Gamma_k (G)$.  Start by setting $s = r$.
We will define the probability distribution $A$ on $\cal{A}$ as follows.
For each of the first $L$ steps,
choose $i \in \{ 1, \ldots, s \}$ and $\pm$ uniformly at random, and
apply $R_{s+1, i}^{\pm}$.
After these $L$ steps, increment $s$ and repeat.
Analysis of the Babai algorithm (November 14) shows that
if we take $L = O({\log}^3 \vert G \vert)$ and
do this $l = O(\log \vert G \vert)$ times, we should have
$g_{r+l}$ close to uniform in $G$.
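In Python, the embedding can be sketched as follows (our own illustrative
sketch; the abstract \texttt{mul}/\texttt{inv} interface and the toy values of
$L$ and $l$ are assumptions, far below the bounds required by the analysis):

```python
import random

def babai_cube_walk(gens, e, mul, inv, L, l, rng=random):
    """Fill positions r+1, ..., r+l of a tuple in Gamma_k(G).

    Round t applies L random moves multiplying the entry being filled
    by tup[i]^{+-1} for a random earlier position i, mimicking Babai's
    random-subproduct generator for nearly uniform group elements.
    """
    tup = list(gens) + [e] * l
    r = len(gens)
    for t in range(l):
        s = r + t                      # 0-based index of the position to fill
        for _ in range(L):
            i = rng.randrange(s)       # earlier position, i < s
            g = tup[i] if rng.random() < 0.5 else inv(tup[i])
            tup[s] = mul(tup[s], g)    # the move R_{s+1, i+1}^{+-}
    return tup

# Toy run on Z_101 (additive); the analysis takes L = O(log^3 |G|).
n = 101
tup = babai_cube_walk([1], 0, lambda a, b: (a + b) % n,
                      lambda a: (-a) % n, L=50, l=5)
assert len(tup) == 6 and all(0 <= g < n for g in tup)
```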

By multiplying every position by a nearly uniform group element in this
manner, we can obtain a nearly uniform element of $\Gamma_k (G)$ in
$T = O( k \cdot {\log}^4 \vert G \vert)$ steps.
There are a lot of technical details to work through here, and they
weren't covered in class.

As $k \leq c^{\prime} \log \vert G \vert \log \log \vert G \vert$,
Theorem \ref{thm:blind} yields
\begin{eqnarray*}
\tau_4 & = & O(T^2 k^2 \log {\vert G \vert}^k) \;=\; O(T^2 k^3 \log \vert G \vert) \\
& = & O(k^5 {(\log \vert G \vert)}^9) \\
& = & O({\log}^{14} \vert G \vert \, {(\log \log \vert G \vert)}^5)
\end{eqnarray*}
as desired.

\end{document}
