What's new:
Dec 6, 2005
- Uploaded:
Sample chapters for the second edition of my book “Solving
mathematical problems: a personal perspective”. Due to “popular demand”,
this book is now being reprinted by Oxford University Press, and should be
available roughly at the time of the 2006 ICM in August.
Dec 5, 2005
- Uploaded:
“The dichotomy
between structure and randomness, arithmetic progressions, and the primes”. This is a full-length version of a paper
submitted for the 2006 ICM proceedings; due to length considerations, the
actual ICM paper will be a substantially abridged version of this
paper. Here I survey a class of
structure theorems – that roughly speaking obtain a decomposition of
an arbitrary object into a “structured” (low-complexity,
self-correlated, measurable) component and a “pseudorandom”
(high-complexity, discorrelated, anti-measurable) component – and
how they can be used to prove results such as Szemeredi’s theorem
and the Green-Tao theorem on arithmetic progressions (in dense subsets of
integers, and dense subsets of primes, respectively). The structure theorems exist in many
contexts (ergodic theory, Fourier analysis, graph theory, hypergraph
theory, probability theory, information theory) but the point of this
paper is to emphasize the common themes between these theorems even though
their settings vary widely.
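Schematically, the decompositions surveyed here take the form

```latex
f \;=\; f_{\mathrm{str}} + f_{\mathrm{psd}},
```

where the structured component $f_{\mathrm{str}}$ has low complexity and the pseudorandom component $f_{\mathrm{psd}}$ is negligible for the purpose of counting patterns (for instance, by being small in a suitable Gowers uniformity norm); the precise meaning of "structured" and "pseudorandom" varies from context to context.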
Nov 8, 2005
- Uploaded:
“Inverse
Littlewood-Offord theorems and the condition number of random discrete
matrices”, joint with Van
Vu. Here we return to random
Bernoulli matrices, but instead of trying to control the determinant or
the singularity probability of these matrices, we now consider the
condition number, which is basically the ratio between the largest and
smallest singular value. The
largest singular value is quite stable and is known to be comparable to n^{1/2}
with very high probability, so it is only a matter of understanding the
lowest singular value. This is conjectured
to be of size roughly n^{-1/2} (in analogy with random Gaussian matrices),
and there is a result of Rudelson showing that this value is at least
n^{-3/2} with reasonable probability.
The main result of this paper is a tail estimate, namely for any A
there exists a B such that the lowest singular value is at least n^{-B}
with probability 1 – O(n^{-A}).
Equivalently, the condition number is O(n^{B}) with probability 1
– O(n^{-A}). There are three
major ingredients in the proof, which may have some independent
interest. The first ingredient is a
new inverse Littlewood-Offord theorem which almost completely
characterizes which hyperplanes intersect the discrete unit cube {-1,1}^n
with polynomial density (the coefficients of the normal vector have to lie
almost completely in a generalized arithmetic progression of bounded
size). This theorem, which is
somewhat similar in spirit to Freiman’s inverse sumset theorem, is
proven mostly by elementary methods (with a touch of Fourier analysis),
the main difficulty being a technical issue of “clearing
denominators” with only polynomial losses. The second ingredient, which is also
new, is a discretization of progressions, which allows one to split any
progression as the sum of a sparse progression (wide separation between
elements) and a small progression.
This is a discrete analogue of the fact that a linear map can be
written as the direct sum of an injective map and a zero map. Combining these ingredients allows us to
convert the smallest singular value problem to the singularity problem,
which we handle by a generalization of the Kahn-Komlos-Szemeredi theorem
on random Bernoulli matrices, allowing a small number of the rows to be
replaced by deterministic vectors.
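In symbols (with $s_1(M) \ge \cdots \ge s_n(M)$ denoting the singular values of the random Bernoulli matrix $M$; the notation here is mine), the main result asserts that for every $A$ there is a $B$ with

```latex
\mathbf{P}\bigl( s_n(M) \le n^{-B} \bigr) \;=\; O(n^{-A}),
\qquad
\kappa(M) \;:=\; \frac{s_1(M)}{s_n(M)},
```

so that, combined with the bound $s_1(M) = O(n^{1/2})$ which holds with very high probability, the condition number $\kappa(M)$ is polynomial in $n$ outside an event of probability $O(n^{-A})$.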
Nov 3, 2005
- Uploaded:
“The nonlinear
Schrödinger equation with combined power-type nonlinearities”,
joint with Monica Visan
and Xiaoyi Zhang. Here we
systematically study the global well-posedness and scattering theory for
an NLS with two power nonlinearities, either of which can be focusing or
defocusing, in three and higher dimensions, in the energy space H^1 and
also in the pseudoconformal space Sigma, without any symmetry assumptions
on the data. If the larger power
was defocusing, it was already known that as long as both powers were
H^1-subcritical, then one had global well-posedness, and it was also known
that the single power defocusing H^1-critical equation had global
well-posedness. We combine these
results by extending the former to the endpoint case when the larger power
is H^1-critical; this is by a perturbation argument using the H^1-critical
pure power equation as the approximating equation. Next, we consider the scattering theory
in H^1, and show that one has scattering in the doubly defocusing case as
long as both powers lie between the L^2-critical and H^1-critical
exponents inclusive. In the case
that the lower power is L^2-critical, we need an additional hypothesis,
namely that the pure power L^2-critical equation enjoys scattering and
spacetime estimates (an “L^2-critical gopher”, if you
will). This is especially delicate
for the “doubly critical” case involving both the L^2-critical
power and the H^1-critical power; here we need to approximate the
low-frequency evolution by the L^2-critical equation and then the
high-frequency evolution by the H^1-critical equation, keeping the medium
frequencies under control via a Morawetz inequality. We also exploit the virial identity to
force blowup in some negative-energy, inward-momentum cases, and the
pseudoconformal identity to obtain scattering in Sigma in other cases.
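For orientation, the equation studied here can be written schematically as

```latex
i u_t + \Delta u \;=\; \mu_1 |u|^{p_1} u + \mu_2 |u|^{p_2} u,
\qquad u(0) = u_0 \in H^1(\mathbb{R}^n),\quad n \ge 3,
```

where each sign $\mu_i \in \{-1,+1\}$ corresponds to a focusing or defocusing power, and the $L^2$-critical exponent $p = 4/n$ and the $H^1$-critical exponent $p = 4/(n-2)$ mark the endpoints referred to above. (The exact normalizations are mine; consult the paper for the precise setup.)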
- Uploaded:
“Entropy sum set
estimates”, a deleted chapter from my upcoming book with Van
Vu. Here we take the
Plunnecke-Ruzsa theory of sumset estimates, but replace finite sets by
random variables, and cardinality by entropy. The graph-theoretical Plunnecke
inequalities no longer seem to hold in this setting, but almost all the
remaining theory works, and is in fact somewhat cleaner (all constant
factors in the estimates automatically get replaced by 1, thanks to an
elegant “tensor power trick” of Ruzsa). In fact the theory falls out neatly from
standard Shannon entropy inequalities,
most notably the submodularity inequality.
The same techniques also give some of the sharpest known results
(due to Katz and Tardos) on the Kakeya conjecture and the Erdos distance
problem, which we also discuss here.
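One representative statement of this entropy calculus is the entropy analogue of the Ruzsa triangle inequality: for independent discrete random variables X, Y, Z one has H(X-Z) <= H(X-Y) + H(Y-Z) - H(Y), with constant exactly 1. As a quick numerical sanity check (my own illustration, not code from the chapter):

```python
import math
import random

def H(p):
    """Shannon entropy (bits) of a distribution given as {value: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def diff_dist(pX, pY):
    """Distribution of X - Y for independent X, Y."""
    d = {}
    for x, px in pX.items():
        for y, py in pY.items():
            d[x - y] = d.get(x - y, 0.0) + px * py
    return d

def random_dist(support, rng):
    """A random probability distribution on the given support."""
    w = [rng.random() for _ in support]
    s = sum(w)
    return {a: v / s for a, v in zip(support, w)}

rng = random.Random(1)
violations = 0
for _ in range(100):
    pX, pY, pZ = (random_dist(range(5), rng) for _ in range(3))
    lhs = H(diff_dist(pX, pZ))
    rhs = H(diff_dist(pX, pY)) + H(diff_dist(pY, pZ)) - H(pY)
    if lhs > rhs + 1e-9:
        violations += 1
print("violations:", violations)
```

Each trial draws three independent distributions on {0,...,4} and verifies the inequality for the corresponding difference distributions; since the inequality follows from submodularity, no violations should occur.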
- Uploaded:
“Compressions, convex
geometry, and the Freiman-Bilu theorem”, joint with Ben Green. This short paper more or less fell out
of a day’s worth of discussion at MIT, much to the surprise of both
of us. Here what we do is we
observe that the Brunn-Minkowski theorem, combined with the standard
technique of compressions (as used for instance by Bollobas and Leader), shows
that any suitably large subset of a d-dimensional discrete box must have
doubling constant at least 2^d (minus epsilon). This, when combined with Chang’s
version of Freiman’s theorem, gives a reasonable quantitative
version of the Freiman-Bilu theorem.
To oversimplify a little, we show that any set A of integers of
doubling constant K can be covered by exp( C K^3 log^3 K ) translates of a
progression of dimension at most log K and size at most |A|. For comparison, the original arguments
of Freiman and Bilu require more like exp(exp(K^C)) translates, while
Chang’s theorem uses only one translate but the progression has
dimension K and size exp( C K^2 log^3 K ) |A|. We also observe a variant, that any
subset of R^d which contains a d-dimensional nondegenerate parallelepiped
must have doubling constant at least 2^{d/2} (which is pretty sharp, as
one cannot do better than (3/2)^d).
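In schematic form, with $K = |A+A|/|A|$ the doubling constant, the covering statement obtained here is

```latex
A \;\subset\; \bigcup_{i=1}^{N} (x_i + P),
\qquad N \le \exp(C K^3 \log^3 K),\quad
\dim P \le \log K,\quad |P| \le |A|,
```

to be compared with $N = \exp(\exp(K^C))$ translates in the original arguments of Freiman and Bilu, and with $N = 1$ but $\dim P \le K$ and $|P| \le \exp(C K^2 \log^3 K)\,|A|$ in Chang's theorem.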
- Uploaded:
“Velocity Averaging,
Kinetic Formulations and Regularizing Effects in Quasilinear PDEs”,
joint with Eitan
Tadmor. In this project (which
we started six years ago!) we revisit the averaging lemma which provides a
smoothing effect for the velocity averages of (entropy) solutions to
kinetic formulations of quasilinear equations. The main novelty here is an abstract
approach which allows us for the first time to obtain the smoothing effect
for second order (convection-diffusion or convection-elliptic) systems,
going beyond the first order conservation law smoothing effects in
existing versions of the lemma. The
key is that for each fixed “velocity” v, the multiplier
associated to v, being a combination of a real elliptic quadratic and an
imaginary linear term, obeys a certain “truncation property”
uniformly in v which allows one to obtain a good Littlewood-Paley theory
relative to that multiplier. Combined with the standard non-degeneracy
condition (namely, that for any fixed frequency, the multiplier associated
to v is large for most velocities v) we obtain the regularizing
effect. In the case of higher order
vanishing of the symbol we obtain slightly better exponents than in
preceding results even for the first-order case (pure transport).
Oct 26, 2005
- Uploaded:
“Maximal multilinear
operators”, joint with Ciprian Demeter and Christoph Thiele. In this paper we generalize a result of
Michael Lacey concerning the bilinear maximal operator, to now obtain
similar L^p bounds for arbitrary multilinear maximal operators, both in
the context of functions on R and (via a simple transference principle)
functions on a measure-preserving system, though the strength of the
estimates depends greatly on the rank of the multilinear averaging (the
main results are in the rank one case, the remaining ones being obtained
by interpolation). We follow
Lacey’s approach with some additional simplifications. Firstly we use now-standard multilinear
interpolation techniques and time-frequency discretization techniques to
reduce to a certain restricted weak type estimate for a model
operator. Using standard tree
selection algorithms, the problem reduces to that of establishing a
maximal Bessel inequality for a forest of wave packets, with a certain
logarithmic loss. By using a
certain “good lambda” type argument one can make this
logarithmic loss depend only on the tree multiplicity. By spatial localization arguments (and a
nonmaximal Bessel inequality, which has some independent interest) and the
Rademacher-Menshov inequality we can normalize the trees to have common
spatial interval, at which point the claim follows from an inequality of
Bourgain.
Sep 23, 2005
- Uploaded:
“New bounds for
Szemeredi's Theorem, I: Progressions of length 4 in finite field
geometries”, joint with Ben Green. This paper was in fact my first joint
project with Ben, and our first approach to this problem eventually
revolved around finding versions of Szemeredi’s theorem relative to
certain quadratic varieties, which we treated as being rather pseudorandom
objects (at least from a linear perspective). Then Ben observed that having a version
of Szemeredi’s theorem relative to pseudorandom sets would be very
useful for obtaining progressions in the primes, and so we dropped this
project to study
primes. Two years later, we
finally returned to the problem.
Actually things turned out rather well, because in the meantime Ben
came up with a better proof of the main result (being a hybrid of energy
incrementation and density increment arguments, rather than a pure density
increment argument, inspired by some arguments of Heath-Brown and
Szemeredi, and avoiding the need to relativize to quadratic
varieties). Ultimately, the goal is
to show that given any finite abelian group G, any subset of density at
least 1/(log |G|)^c for some small absolute constant c > 0 contains
nontrivial progressions of length 4 (the result for 3 is already known,
for instance by Bourgain’s proof of Roth’s theorem). Note that c=1 would essentially yield
the Erdos-Turan conjecture for length 4 progressions (this is unknown even
for length 3 progressions, where one essentially only has c=1/2, except in
the finite field geometry case where one does have c=1). In a previous paper we
established 1/(log log |G|)^c by developing an inverse U^3 theorem. In this paper we establish the 1/(log
|G|)^c bound in the special case of a finite field geometry, i.e. G is a
vector space over a small finite field (e.g. F_5). The idea is to first combine the inverse
U^3 theorem with the energy incrementation argument from our primes paper to
obtain a Koopman-von Neumann structure theorem that decomposes any
function into a structured component of low quadratic complexity, and an
error which is quadratically uniform.
Discarding the error, the task is thus to count progressions of
length 4 in the structured component, which is a function depending only
on a bounded number of linear and quadratic phase functions. This can be done by linear Fourier
analysis (and Gauss sum computations), and eventually leads to a
substantial density increment on the function if it has an anomalous
number of progressions of length 4.
This is then fed back into the density increment argument to obtain
the 1/(log |G|)^c bound. The final
computation (which is reminiscent of Fourier analysis on 2-step nilmanifolds)
is a little messy, but by being more low-tech one can obtain a “cheap”
bound of the form exp( - c sqrt(log log |G|) ) without much difficulty, by
ignoring almost all of the quadratic structure (slicing the quadratic
variety right away into linear spaces).
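Schematically, the bound established here (in the model case $G = \mathbb{F}_5^n$) asserts:

```latex
A \subset \mathbb{F}_5^n,\quad
|A| \,\ge\, \frac{|G|}{(\log |G|)^{c}}
\;\Longrightarrow\;
x,\; x+d,\; x+2d,\; x+3d \in A
\ \text{ for some } x, d \in G,\ d \neq 0.
```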
- Updated
again: “Sharp
well-posedness and ill-posedness results for a quadratic non-linear
Schrödinger equation”, joint with Ioan Bejenaru. A rather confusing typo (the weight w
was reported as min(1, -tau)^10 instead of max(1,-tau)^10) has been
fixed. Thanks to Hideo Takaoka for
pointing out the error.
Sep 14, 2005
Sep 12, 2005
- Uploaded:
“On the Multilinear
Restriction and Kakeya conjectures”, joint with Jon Bennett and Tony
Carbery. This marks my return
to restriction and Kakeya problems after an absence of several years. Here we investigate the n-linear
analogues of the restriction and Kakeya conjectures in n dimensions, where
there is a surprising amount of self-similarity in the problem, and where
the role of curvature completely vanishes, being replaced instead by the
concept of transversality. Here,
the induction on scale machinery of Bourgain and Wolff can be used to show
that the multilinear Kakeya and restriction conjectures are equivalent up to epsilons, and that
the Kakeya conjecture implies itself in a certain scale-inductive
sense. This latter observation can
be refined using the heat flow monotonicity methods that were already
known to work for the Brascamp-Lieb problem, thus yielding a heat-flow
proof of the multilinear Kakeya conjecture and thence the multilinear
restriction conjecture, up to epsilons.
Sadly, this does not resolve the linear versions of these
conjectures, though it does eliminate a certain “non-plany”
scenario and also has an application to a combinatorial analogue of the
Kakeya problem, namely the joints problem of Sharir. We also discuss some variable
coefficient and k-plane analogues that can be treated by the method. One unusual feature of the approach is
the reliance on identities involving a general exponent p, which are
difficult to prove directly, but which we establish by first proving the
identities when p is a positive integer and then extrapolating (via
analytic continuation and approximation theory) to general p (in fact, we
need these identities for p < 1, usually).
Aug 20, 2005
Aug 18, 2005
- Uploaded:
“Progress on nonlinear wave
equations”. These are
some rather informal slides on progress on critical geometric nonlinear
wave equations for the upcoming Cambridge
conference.
- Updated:
“A quantitative proof of
Wiener’s theorem”.
As pointed out by Brian Davies, there is a very short quantitative
proof of Wiener’s theorem due to Newman which is somewhat along the
lines of the arguments in this note.
Aug 11, 2005
- Uploaded:
“Sharp well-posedness
and ill-posedness results for a quadratic non-linear Schrödinger
equation”, joint with Ioan Bejenaru. Here we revisit a simple nonlinear
Schrödinger equation, namely the 1D quadratic NLS i u_t + u_xx = u^2. This equation was shown to be locally
well-posed in H^s(R) for s > -3/4 by Kenig, Ponce and Vega using the X^{s,b} spaces,
and it was later shown by Nakanishi, Takaoka, and Tsutsumi that this was the
limit of the method (except for a Besov endpoint obtained by Muramutu and
Taoka). However, an inspection of
iterates suggested that it was possible to attain and even go below -3/4
by using other spaces than the X^{s,b} ones. After some trial and error, Ioan and I
found a suitable (though somewhat complicated) choice of spaces which
enabled us to obtain local well-posedness down to s > -1. Here, the iterates suggested that this
was the sharp limit, and in fact there is an abstract argument (which works
for any nonlinear equation containing a suitable “high-to-low
cascade”, and may be of independent interest) that shows that
illposedness occurs at s=-1 or below.
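To summarize the well-posedness picture obtained for this equation:

```latex
i u_t + u_{xx} = u^2, \quad u(0) \in H^s(\mathbb{R}):
\qquad
\text{locally well-posed for } s > -1,
\qquad
\text{ill-posed for } s \le -1.
```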
- Updated:
“Error Correction
via Linear Programming”.
Due to a merge, this article is now somewhat longer and is joint
with Mark Rudelson
and Roman Vershynin
as well as Emmanuel
Candes.
- Uploaded:
The short story “Quadratic
reciprocity and the theta function”. This is a completely classical
derivation of Gauss’s reciprocity formula via the functional
equation for the Jacobi theta function.
It is hardly new, and was written mostly for my own benefit, but
this cute little proof seems to be not as widely known (except to number
theorists) as it should be, so I’m putting it on-line anyway.
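For the reader's convenience, the functional equation in question is the classical transformation law (stated here in its simplest form, for real $t > 0$)

```latex
\theta(t) \,:=\, \sum_{n \in \mathbb{Z}} e^{-\pi n^2 t},
\qquad
\theta(1/t) \,=\, \sqrt{t}\,\theta(t),
```

from which one can extract the evaluation of Gauss sums, and thence the reciprocity law $\bigl(\tfrac{p}{q}\bigr)\bigl(\tfrac{q}{p}\bigr) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$ for distinct odd primes $p, q$.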
June 30, 2005
- Uploaded:
“Stability of
energy-critical nonlinear Schrödinger equations in high dimensions”,
joint with Monica Visan. This paper arose from Monica’s
investigation of the global well-posedness theory for defocusing energy
critical NLS for large energy data in high dimensions. It turns out in order to complete the
global theory successfully, one had to go back and revisit the local
well-posedness theory, and in particular obtain a stability result (that
near-solutions to NLS can always be approximated by genuine solutions
provided a certain spacetime norm is small) for the equation. This is a fairly routine application of
Strichartz estimates in six and fewer dimensions, but in seven and higher
dimensions the nonlinearity becomes subquadratic, and so the derivative of
the nonlinearity becomes non-Lipschitz; this causes a severe difficulty in
iterating in the usual H^1 Strichartz spaces (and it is likely that the
solution map is not even Lipschitz in these spaces for sufficiently high
dimension). However, it is possible
to recover Holder continuity of the solution map, as well as a stability
result with similar Holder continuous dependence on the perturbation. The trick is to iterate not in the usual
Strichartz spaces (which cost a full derivative and thus sacrifice the
Lipschitz nature of the nonlinearity), but instead to iterate in exotic
(non-admissible) Strichartz spaces, which have the same scaling as the
usual spaces but require only a fraction of a derivative to be applied to
the nonlinearity. One in fact
obtains Lipschitz bounds in the exotic spaces, which after further
iteration then recovers estimates of Holder continuity type in the
original Strichartz spaces. In
particular, we obtain for the first time uniform local well-posedness for the energy-critical NLS in
higher dimensions (previously, the continuous dependence on the data was
either not known to be uniform, or was only known to be uniform in rougher
topologies than the energy topology).
As a (rather standard) consequence we also obtain scattering
results (with bicontinuous wave operators) provided that an a priori spacetime bound is established
on finite energy solutions. (In the
defocusing energy-critical NLS, this bound, which also establishes global
well-posedness and regularity for this equation, will be forthcoming in
Monica’s thesis.)
June 8, 2005
- Uploaded:
The expository note “A
trilinear maximal inequality via arithmetic combinatorics”,
joint with Christoph Thiele
and Ciprian Demeter. Here we use (of all things) arithmetic
combinatorics in order to obtain a non-trivial estimate on a trilinear
maximal estimate (reminiscent of the bilinear maximal function estimate of
Michael Lacey). It only gains an
epsilon over the trivial estimate, however it has the unusual feature of
being completely combinatorial in nature, using no Fourier analysis. There is perhaps a hope that this type
of technique can help give an estimate on the trilinear Hilbert transform,
which appears to be inaccessible to a purely time-frequency approach,
though we emphasize that the maximal operator here is somewhat simpler
than that transform.
June 6, 2005
- Created:
A book page, showing the status of my various
book projects. Most of it is
links to other things already on my web page, but the one new thing is
some sample chapters from my upcoming
CBMS lecture notes on nonlinear dispersive and wave equations.
June 4, 2005
- Uploaded:
“The Dantzig
selector: statistical estimation when $p$ is much larger than $n$”,
joint with Emmanuel Candes. This continues my sequence of papers
with Emmanuel exploring the applications of the l^1 minimization method,
in this case to linear regression models in multivariate statistics. The setup is that there are p unknowns
(represented as a vector x in R^p), and a (known) linear measurement n x p
matrix A. We observe n measurements
y = Ax + z, where z (in R^n) is a noise term with a known power (in
particular, one has a bound on A^t z).
The objective is then to obtain an estimator for x from y which has
as small error (as measured in the mean square sense) as possible. The novelty here is that we treat the
case p >> n, when there are many more unknowns than
measurements. Of course, this
problem is in general unsolvable, but with the additional assumptions that
x is sparse (supported on
somewhat fewer than n values) or compressible
(concentrated in the span of fewer than n vectors in a fixed basis) and
that A is suitably generic
– or more precisely, that it obeys the “uniform uncertainty
principle” introduced in our earlier papers – then we do have
a very efficient estimator, which we dub the “Dantzig
selector” in honour of the late George Dantzig, as it is a linear
programming method. In fact the
selector is very simple: we choose the estimator x^ to minimize the l^1
norm of x^ subject to the constraint that the residual A^t(y - Ax^) is less
than the expected noise level in l^infty.
This estimator turns out to be a sort of nonlinear soft
thresholding for x, and we obtain mean square error estimates that are
within an absolute constant (256, to be precise) of the ideal mean square
error. (In numerical simulations,
the ratio between actual and ideal MSE is more like 20). Our arguments are deterministic and
elementary, based on the deterministic geometry of UUP matrices. We also have a refined version of the
Dantzig selector, which we call the “Gauss-Dantzig selector”,
which compensates for the underestimation bias of the Dantzig selector and
can be thought of as a nonlinear hard thresholding for x rather than a
soft thresholding for x. In
numerical experiments, this refined selector gives an MSE which attains
the ideal MSE almost exactly; basically, it uses the ordinary Dantzig
selector to locate the significant unknowns (usually it locates the exact
support of the true unknowns x with near-perfect accuracy) and then uses a
standard Gauss least-squares method to obtain the ideal selector based on
those unknowns. The method seems to
be quite robust and general; the only catch is that it requires the noise
to be uniformly distributed with respect to the measurement matrix, in the
sense that good l^infty control is assumed on A^t z; this is generically
true up to a logarithmic loss in the number of unknowns p. (Our previous paper,
with Justin Romberg, explored
the case of adversarial noise, in which the only assumption on the noise
is l^2 bounds. There, the ideal MSE
is of course significantly worse, but a slight variant of the Dantzig
selector still attains this MSE up to an absolute constant.) We also give a number of demonstrations
of the method on the recovery of compressible signals (images, spike trains,
piecewise polynomial signals) from noisy and incomplete measurements.
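In symbols, the Dantzig selector is the solution $\hat x$ of the convex program (writing $\lambda$ for the expected noise level; the notation here is mine)

```latex
\hat{x} \,:=\, \arg\min_{\tilde{x} \in \mathbb{R}^p} \|\tilde{x}\|_{\ell^1}
\quad\text{subject to}\quad
\|A^t (y - A\tilde{x})\|_{\ell^\infty} \,\le\, \lambda,
```

which, after splitting $\tilde x$ into positive and negative parts, becomes a linear program, hence the name.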
May 31, 2005
- Uploaded:
“Finiteness bounds
for Holder-Brascamp-Lieb multilinear inequalities”, joint with
Jon Bennett, Tony
Carbery, and Michael
Christ, submitted, Math. Res.
Letters. Here, we give an
elementary proof of one of the results in our larger paper, namely a
necessary and sufficient condition for a multilinear inequality of
Holder-Brascamp-Lieb type (possibly with localization or discretization)
to hold with a finite constant. As
it turns out, such an inequality can be deduced purely from an iterative
argument using nothing more than Holder’s inequality (as the base
case), and either the Fubini-Tonelli theorem, Minkowski’s inequality
or the multilinear interpolation theorem (depending on whether a critical
space exists or not, and whether one of the exponents is equal to 1). This generalizes the familiar proof of
Young’s inequality (or the Loomis-Whitney inequality, including its
generalizations to arbitrary measure spaces and co-ordinate planes by
Finner) using exactly the same tools.
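The inequalities in question take the schematic form (with the $B_j$ linear maps and the $p_j$ exponents)

```latex
\int \prod_{j=1}^{m} f_j(B_j x)\, dx
\;\le\; C \prod_{j=1}^{m} \|f_j\|_{L^{p_j}},
```

which includes Hölder's inequality (all $B_j$ the identity), Young's convolution inequality, and the Loomis-Whitney inequality (the $B_j$ coordinate projections) as special cases.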
May 18, 2005
- Uploaded:
“Obstructions to
uniformity, and arithmetic patterns in the primes”, submitted, Quarterly J. Pure Appl. Math.
(special issue in honour of John Coates).
In this survey I discuss the approach Ben Green and I have to
finding arithmetic progressions (and hopefully other patterns too) in the
prime numbers and similar such sets.
In particular, I discuss how we use enveloping sieves, such as those
constructed from truncated divisor sums by Goldston and Yildirim, to view
the primes as a set of positive relative density, as opposed to one of
zero absolute density, and then how one can hope to count patterns in the
primes by first quantifying the obstructions there are to uniformity (in
the sense of Gowers), and then averaging out the effect of those
obstructions. There is a
distinction between the “hard” obstructions to uniformity
coming from Fourier analysis, and more recently from the quadratic Fourier
analytic techniques of Gowers (as well as the closely related work from
ergodic theory from Bergelson, Host, Kra, Ziegler, and others), and the
“soft” obstructions coming from the earlier work of
Furstenberg, Katznelson, Weiss, and others; the situation is still not
completely clarified yet but there are many intriguing interconnections
between harmonic analysis, combinatorics, ergodic theory, group theory,
and number theory being unearthed.
May 14, 2005
- Updated:
the short story “An
ergodic transference theorem”.
I have completely rewritten and expanded this article, using a more
abstract viewpoint and also going into greater detail into the Furstenberg
correspondence principle. In fact
we manage to derive this principle without any appeal to the axiom of
choice.
May 9, 2005
- Uploaded:
“Random symmetric
matrices are almost surely non-singular”, joint with Kevin
Costello and Van Vu,
submitted, Duke Math. J. Here we adapt an argument of Komlos to
show that a random symmetric n x
n matrix with entries +-1, distributed independently on the upper
triangular half of the matrix (and then fixed by symmetry on the strict
lower triangular component), is invertible with probability 1 – O(
n^{-1/8+eps} ) for any eps > 0.
The main difficulty here is that the symmetry couples different
entries together, making the standard incremental approach of exposing one
row of the matrix at a time rather intractable. However if one adopts Komlos’
original approach of exposing a row and column simultaneously then one can proceed. The main difficulty that needs to be
overcome is that the exposing of both a row and a column changes the
determinant by a random quadratic form, and one needs a quadratic
Littlewood-Offord inequality to show that this form is usually
non-zero. We establish such an
inequality using a general Cauchy-Schwarz approach (reminiscent of
Weyl’s approach to exponential sums), which may be of interest for
other applications.
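As a toy illustration of the phenomenon (my own simulation, not taken from the paper), one can sample symmetric sign matrices as described above and count singular instances, computing exact integer determinants via the Bareiss fraction-free elimination:

```python
import random

def det_bareiss(M):
    """Exact integer determinant via the fraction-free Bareiss algorithm."""
    M = [row[:] for row in M]
    n = len(M)
    sign, prev = 1, 1
    for k in range(n - 1):
        if M[k][k] == 0:
            # pivot: swap in a later row with a nonzero entry in column k
            for i in range(k + 1, n):
                if M[i][k] != 0:
                    M[k], M[i] = M[i], M[k]
                    sign = -sign
                    break
            else:
                return 0  # the whole column vanishes: singular
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Bareiss update; the division by prev is always exact
                M[i][j] = (M[i][j] * M[k][k] - M[i][k] * M[k][j]) // prev
            M[i][k] = 0
        prev = M[k][k]
    return sign * M[-1][-1]

def random_symmetric_sign_matrix(n, rng):
    """Symmetric n x n matrix with iid +-1 entries on the upper triangle."""
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = rng.choice((-1, 1))
            M[i][j] = M[j][i] = v
    return M

rng = random.Random(0)
trials, n = 200, 10
singular = sum(1 for _ in range(trials)
               if det_bareiss(random_symmetric_sign_matrix(n, rng)) == 0)
frac = singular / trials
print(f"singular fraction for n={n}: {frac}")
```

For n = 10 the observed singular fraction should be small, consistent with the 1 - O(n^{-1/8+eps}) invertibility bound proved in the paper (though of course a simulation at a single small n proves nothing).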
- Uploaded:
The expository note “Santalo’s
inequality”. This is another
“deleted scene” from my additive combinatorics book with Van
Vu, which we are still in the process of editing (in particular, removing
a certain amount of tangential material).
Here we give the proof of Santalo’s inequality giving an
upper bound for the product of the volumes of a symmetric convex body and its
polar. We also discuss the known
lower bounds, giving in particular Kuperberg’s simple (though
non-optimal) result.
May 2, 2005
- Uploaded:
“The Brascamp--Lieb
Inequalities: Finiteness, Structure and Extremals”, joint with
Jon Bennett, Tony
Carbery, and Michael
Christ, submitted, Inventiones Math. In this paper we revisit the
general-rank, higher-dimensional Brascamp-Lieb inequalities, which
generalize several useful inequalities in analysis such as Young’s
inequality, the Loomis-Whitney inequality, and Holder’s
inequality. We present a new,
elementary, and completely self-contained proof of these inequalities with
the sharp constant using a multilinear heat flow approach (which we
discovered while working on the Kakeya problem, but that’s another
story, to be told in a subsequent paper), and also give necessary and
sufficient conditions for the constant to be finite, and for the constant
to be attainable. It turns out that
there is a lot of geometric structure in these inequalities, involving
quiver representations, of all things, and a factorization of
Brascamp-Lieb “data” into indecomposable or simple components,
as well as a normal form, turns out to be essential to understanding them.
We also obtain some curious monotonicity formulae for the heat equation
(as well as some other convolution operators) as a by-product. This is also my first paper which
involves commutative diagrams (five of them, to be exact). Correction
May 19: As pointed out to me by a co-author who shall remain
anonymous, this is in fact my second paper involving commutative
diagrams. The first one can be
found here.
Apr 23, 2005
Apr 8, 2005
- Uploaded:
“The
Gaussian primes contain arbitrarily shaped constellations”,
submitted, J. d’Analyse Mathematique. This is a totally reworked version of my
preceding paper (the one with the impossibly long title). Here we take the argument of Ben Green
and myself that establishes long arithmetic
progressions in the primes, and extends it to the Gaussian integers,
thus showing that the Gaussian primes contain constellations of any given
shape and orientation. The main new
difficulty is that Szemeredi’s theorem, which was a major input in
my paper with Ben, must now be replaced by another result. The obvious candidate here is the
Furstenberg-Katznelson multidimensional Szemeredi theorem. However, I do not know how to run the
rest of the argument with this input, mainly because I do not understand
the characteristic factor for multidimensional Szemeredi recurrence (this
is in fact an active area of research in ergodic theory at present). Instead, I use the hypergraph removal
lemma of Gowers and Nagle-Rodl-Schacht-Skokan (or more precisely the
variant introduced in my
preceding paper) as the tool used to control the “Gowers
anti-uniform” component of the Gaussian primes. The remaining parts of the proof closely
follow the paper with Ben, thus we develop a generalized von Neumann
theorem to deal with the “Gowers uniform” component of the
primes, and a “Furstenberg structure theorem” to split the
primes into the Gowers anti-uniform and Gowers uniform components. A large portion of the paper is devoted
to establishing a pseudorandom majorant for the Gaussian primes, which
requires a Goldston-Yildirim type analysis of Gaussian truncated divisor
sums. Here we encounter some new
technical difficulties, arising from some correlations in the Gaussian
primes that are not present in the rational setting.
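As a quick toy illustration (not taken from the paper), one can make the notion of a "Gaussian prime" concrete with the classical characterization of the prime elements of Z[i]; the helper functions below are of course invented for this sketch.

```python
# Sketch (not from the paper): a + b*i is a Gaussian prime iff either
#   * a, b are both nonzero and a^2 + b^2 is a rational prime, or
#   * one of a, b vanishes and the other is, up to sign, a rational
#     prime p with p = 3 (mod 4).

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_prime(a: int, b: int) -> bool:
    """Whether a + b*i is a prime element of the Gaussian integers Z[i]."""
    if a == 0:
        return is_prime(abs(b)) and abs(b) % 4 == 3
    if b == 0:
        return is_prime(abs(a)) and abs(a) % 4 == 3
    return is_prime(a * a + b * b)

# 1+i is prime (norm 2); 3 stays prime in Z[i]; 5 = (2+i)(2-i) does not.
print(is_gaussian_prime(1, 1), is_gaussian_prime(3, 0), is_gaussian_prime(5, 0))
```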
- Uploaded:
“Error Correction via Linear
Programming”, joint with Emmanuel Candes,
submitted, FOCS ’05 proceedings.
This is a condensed version of our linear coding paper,
giving an elementary proof of the main result, namely that a system of
random overdetermined measurements can correct a fixed fraction of errors
perfectly using the basis pursuit method.
Mar 28, 2005
- Uploaded:
the short story “An
ergodic transference theorem”.
This is an attempt to redo, by “soft” ergodic
methods (and in particular by reworking the Furstenberg correspondence
principle), the transference theorem of Ben Green and myself, which can
be used (among other things) to derive long arithmetic progressions in
the primes. Interestingly, the
ergodic methods can partially achieve this, but the catch is that one
requires arbitrarily many linear forms conditions, instead of finitely
many. As such, one does not recover
our result about long arithmetic progressions in the primes, although one
does get some weaker results, such as a Szemeredi theorem for subsets of
positive _relative_ density of a random subset of {1,…,N} of density
1/log N. I do not know whether this
need for infinitely many linear forms conditions is an inherent feature of
the soft ergodic method, or if it can be somehow overcome.
Mar 24, 2005
- Uploaded:
“A variant of the
hypergraph removal lemma”, submitted, J.
Combin. Thy.. This is the first
half of the new version of my Gaussian constellations paper, which will
now appear in two separate papers; the second paper will appear
shortly. Here we reprove the
hypergraph removal lemma of Gowers and Nagle-Rodl-Schacht-Skokan, with a
slight extra twist (which is needed in order for a transference result in
the second paper to work properly).
We take an “ergodic” perspective to this lemma, viewing
the problem as one of taking an arbitrary set and decomposing it into a
“compact” part which is of “bounded complexity”,
and a “weakly mixing part” which is pseudorandom (epsilon
regular). The version of the
Szemeredi regularity lemma that we need is thus similar in spirit to the
Furstenberg structure theorem. As
usual we need a counting lemma, which is the usual careful application of
Cauchy-Schwarz. One feature of the
regularity lemma (which it shares with a recent paper of Rodl
and Schacht on the same issue) is that we do not obtain regularity on the
original hypergraph directly, but instead work with another hypergraph
which is a small modification of the original one (in Rodl-Schacht, they
add or remove a small number of edges; here we allow the weight function
of the hypergraph to be perturbed in L^2).
This allows us to obtain excellent regularity control on the new
hypergraph, which greatly facilitates the counting lemma. At 23 pages, this paper is then one of
the shortest elementary proofs available of the hypergraph regularity
lemma and its consequences such as Szemeredi’s theorem and the
Furstenberg-Katznelson theorem.
Mar 17, 2005
- Uploaded:
“Some analysis at UCLA”. This is a very short (10 min)
presentation to incoming UCLA grad students. I try to sketch out the proof of long
arithmetic progressions in the primes, but clearly with the time
constraints it is only the briefest of sketches.
Mar 3, 2005
- Uploaded:
“Stable Signal Recovery
from Incomplete and Inaccurate Measurements”, with Emmanuel Candes and Justin Romberg. This is the latest in our sequence of
papers exploring the basis pursuit method (l^1 minimization) to
reconstruct sparse or compressible data from a sufficiently
“skew” measurement ensemble.
Here we address an issue not treated in the earlier papers, namely
that of robustness – what happens to the reconstruction method when
non-sparse noise is introduced into the measurements (e.g. roundoff
error). As it turns out, the most
naïve relaxation of the basis pursuit method – allowing the
minimizer to approximate the measurements rather than match them exactly
– gives a surprisingly robust algorithm; indeed, the error incurred
at the end is at most a constant times the best possible error one could
achieve with any algorithm, assuming that the measurement matrix obeys a
restricted orthonormality property which we have called the uniform
uncertainty principle. Our
arguments are simple and in fact give some slight improvements over our
earlier results in the error-free case, for instance increasing the
corruption rate we can tolerate in linear coding by a modest amount.
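The flavor of the relaxed method can be sketched numerically. The paper's program minimizes the l^1 norm subject to an l^2 constraint on the residual; as a rough stand-in (not the paper's algorithm), the sketch below solves the closely related penalized form min 0.5||Ax-b||_2^2 + lam||x||_1 by iterative soft thresholding. All dimensions, the noise level, and lam are made up for the demo.

```python
import numpy as np

# Illustrative sketch only: recover a sparse vector from noisy random
# measurements via the penalized l^1 relaxation
#   min_x  0.5 * ||A x - b||_2^2 + lam * ||x||_1
# solved by iterative soft thresholding (ISTA).  This is a cousin of the
# paper's program (which constrains the l^2 residual directly), chosen
# here because it needs no LP/SOCP solver.

rng = np.random.default_rng(0)
n, m, k = 40, 20, 3                  # signal length, measurements, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true + 0.01 * rng.standard_normal(m)   # noisy measurements

lam = 0.01
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
x = np.zeros(n)
for _ in range(2000):
    g = x + (A.T @ (b - A @ x)) / L                         # gradient step
    x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)   # soft threshold

print("max recovery error:", np.max(np.abs(x - x_true)))
```

The point to notice is exactly the robustness claim above: the recovery error ends up on the order of the measurement noise, not catastrophically larger.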
Mar 1, 2005
- Uploaded:
“An inverse
theorem for the Gowers $U^3(G)$ norm”, joint with Ben Green, submitted, Proc.
Edin. Math. Soc.. This paper
systematically studies the quadratic (d=3) case of the Gowers uniformity
norm U^d(G), which can be defined for any finite abelian group G but was
introduced by Gowers for the cyclic groups G = Z/NZ. (A closely related norm U^d(X) for
measure-preserving systems X was studied by Host and Kra). This norm is given by an explicit
formula (an average over d-dimensional cubes), and informally measures the
extent to which the phase of a function on G behaves like a polynomial of
degree d-1 or less (or in other words the d’th
“derivative” of the phase behaves like zero). Thus for instance the U^3(G) norm of f
is large when the phase of f behaves quadratically. These norms are of fundamental
importance in Gowers’ approach to obtaining arithmetic progressions
in dense sets, and also in my paper with Ben
establishing long arithmetic progressions in the primes. When the U^d norm is small, one can
accurately compute the number of progressions of length d+1 in an
associated set; thus a small U^3(G) norm is useful for counting
progressions of length 4. However,
it is not always easy to determine when a set has small U^d norm. In a paper of Gowers it was shown that
in order for the U^3(Z_N) norm of a bounded function to be small, it
suffices for that function to have no “local quadratic bias”
– that on most short progressions the function does not correlate
strongly with a quadratic phase function; this is the essential step in
Gowers’ proof of Szemeredi’s theorem for progressions of
length 4. Here we extend and
globalize this analysis, showing that in order for the U^3(G) norm of a
bounded function to be small, it is necessary and sufficient that the
function have no “global quadratic bias” – that it does
not correlate with a globally quadratic phase function. (Actually, this is an oversimplification;
one also has to check a certain class of “almost global quadratic
phase functions”, which are quadratic on a large Bohr set). This gives a new proof of
Szemeredi’s theorem for progressions of length 4 and arbitrary
groups G of order coprime to 2 and 3, with a reasonably good bound (any
subset of G with density 1 / (log log |G|)^c will have a progression of
length 4). In a future paper we
shall also apply this inverse theorem (or more precisely a generalization
of it to pseudorandom measures) to establish that the primes contain the
expected number of progressions of length 4. Our main result can be viewed as
presenting a complete list of obstructions to a bounded function
being quadratically uniform (i.e. having small U^3(G) norm). There are some other lists of
obstructions known – namely generalized quadratic phase functions,
as well as 2-step nilsequences; in the second half of the paper we show
how all of these lists of obstructions are related, and give some new
quantitative estimates on the amount of obstruction presented by these
lists.
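As a toy illustration (not from the paper), the defining cube-average formula for the U^d norm can be evaluated by brute force on a small cyclic group; the code below checks the basic phenomenon that a quadratic phase has full U^3 norm (its third differences vanish) while being small in U^2.

```python
import itertools, cmath

# Toy brute-force evaluation of the Gowers norm U^d(Z/NZ) from its
# defining cube average:
#   ||f||_{U^d}^{2^d} = E_{x, h_1,...,h_d}  prod_{w in {0,1}^d} C^{|w|} f(x + w.h)
# where C denotes complex conjugation (applied when |w| is odd).

def gowers_norm(f, N, d):
    total = 0.0
    for point in itertools.product(range(N), repeat=d + 1):
        x, hs = point[0], point[1:]
        prod = 1 + 0j
        for w in itertools.product((0, 1), repeat=d):
            val = f((x + sum(wi * hi for wi, hi in zip(w, hs))) % N)
            prod *= val.conjugate() if sum(w) % 2 else val
        total += prod.real           # the full average is real and nonnegative
    return (total / N ** (d + 1)) ** (1.0 / 2 ** d)

N = 12
quad = lambda x: cmath.exp(2j * cmath.pi * x * x / N)   # quadratic phase

# Third differences of x^2 vanish, so the quadratic phase has U^3 norm
# exactly 1, even though its U^2 norm (which detects linear phases) is small.
print(gowers_norm(quad, N, 3), gowers_norm(quad, N, 2))
```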
Feb 15, 2005
- Uploaded:
“Decoding by Linear
Programming”, joint with Emmanuel Candes,
submitted. This is an application of
the uniform uncertainty principle (UUP) developed in our previous paper. Here we show that any ensemble of linear
measurements (i.e. a matrix) which obeys the UUP (which basically asserts
that any sparse set of rows in this matrix behaves like an almost
orthonormal system), is a good matrix for linear coding and decoding. In particular we can use such a matrix
as a linear code (but over the reals or integers, not over a finite
field), and can reconstruct the plaintext from the ciphertext even if a
constant fraction of entries of the ciphertext are deleted. This also shows that the UUP implies the
exact reconstruction property (ERP) also discussed in our previous papers,
and in fact implies a deterministic and uniform version of this property,
in that all sparse signals are
reconstructed exactly. In our
previous paper we showed that various standard measurement ensembles
(Gaussian, Bernoulli, randomly selected Fourier) obeyed the UUP with high
probability, and thus for instance Gaussian linear codes can now be
decoded via a simple linear programming algorithm even if a constant
fraction (we obtain 0.3% in theory, but it seems closer to 30% in
practice) of the ciphertext is corrupted.
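The decoding step min_x ||y - Ax||_1 is a linear program; as a self-contained numerical sketch (not the paper's implementation), its minimizer can be approximated by iteratively reweighted least squares instead of an actual LP solver. The sizes and corruption pattern below are invented for the demo.

```python
import numpy as np

# Sketch of l^1 decoding: given a ciphertext y = A x + e with e sparse
# ("gross errors"), recover the plaintext x by approximately solving
#   min_x ||y - A x||_1.
# The linear program is approximated here by iteratively reweighted least
# squares (IRLS) -- illustration only, not the paper's method of solution.

rng = np.random.default_rng(1)
m, k = 60, 10                        # ciphertext length, plaintext length
A = rng.standard_normal((m, k))      # random Gaussian linear code
x_true = rng.standard_normal(k)
e = np.zeros(m)
e[rng.choice(m, 6, replace=False)] = 10.0 * rng.standard_normal(6)
y = A @ x_true + e                   # 10% of entries grossly corrupted

x = np.linalg.lstsq(A, y, rcond=None)[0]   # least-squares starting point
for _ in range(100):
    w = 1.0 / np.maximum(np.abs(y - A @ x), 1e-8)    # reweighting
    Aw = A * w[:, None]
    x = np.linalg.solve(A.T @ Aw, Aw.T @ y)          # weighted normal equations

print("max decoding error:", np.max(np.abs(x - x_true)))
```

Note that plain least squares would be badly thrown off by the large corrupted entries; the l^1 objective ignores them and the clean 90% of the equations pin down x exactly.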
Jan 27, 2005
- Temporarily withdrawn: The paper
“Information
theory, relative versions of the hypergraph regularity and removal lemmas,
the Szemer\'edi-Furstenberg-Katznelson theorem, and prime constellations
in number fields”. There
are a number of minor problems with this paper, and one moderately serious
one, which I need to address.
Normally I would simply correct these errors and post an updated
version, but these issues will require a bit of time on my part, and the
current version contains a number of steps which are phrased in an
awkward, misleading, slightly wrong, or (in two places) extremely wrong
manner. I anticipate replacing the
paper with a cleaner, corrected version in a few weeks.
Jan 19, 2005
- Uploaded:
“On the
singularity probability of random Bernoulli matrices”, with Van Vu, submitted to J. Amer.
Math. Soc.. In this paper we
consider the probability that a random n x n matrix whose entries are +-1
is singular (has determinant zero).
This probability is conjectured to be (1/2 + o(1))^n; a famous
paper of Kahn, Komlos, and Szemeredi showed 0.999^n, which we refined in an earlier paper
to 0.939^n. Here we push things a
bit further, to (3/4 + o(1))^n, which seems to be the limit of the
method. Basically the point is to
replace the discrete unit cube (where the original problem is set) with a
weighted variant on which there is some further concentration on
hyperplanes; this weighted variant stops being better than the original
cube at 3/4, which is why the method stops.
A new innovation is to use some additive combinatorics (Fourier
analysis and Freiman’s theorem) to identify which hyperplanes are
exceptional with respect to this concentration, and count these
hyperplanes directly.
- Uploaded:
“Information
theory, relative versions of the hypergraph regularity and removal lemmas,
the Szemer\'edi-Furstenberg-Katznelson theorem, and prime constellations
in number fields”, submitted to Journal d'Analyse
Mathématique. As the title
suggests, this paper may take a little time to explain. In my paper with Ben
Green establishing arbitrarily long progressions in the primes, we
started with Szemeredi’s theorem on arithmetic progressions, and
used quantitative ergodic methods to then deduce a relative version of
that theorem, working on a pseudorandom set. This then implied progressions in the
primes. In this paper we take a
somewhat stronger theorem than the Szemeredi theorem, namely the
Szemeredi-Furstenberg-Katznelson theorem on constellations in
multidimensional sets (or equivalently, multiple recurrence for commuting
shifts on a measure-preserving system).
This in turn is implied by an even stronger theorem, namely the
hypergraph removal lemma of Gowers and Rodl-Skokan. This theorem in turn is deduced
primarily by means of a hypergraph regularity lemma (also due to Gowers and
Rodl-Skokan). What we do here is give
an information-theoretic proof of the hypergraph regularity lemma, by
means of the Shannon entropy
inequalities. This then leads to a
new proof of the hypergraph removal lemma, hence the
Szemeredi-Furstenberg-Katznelson theorem, and hence of the Szemeredi
theorem. Furthermore, in this
setting it is almost trivial to perturb the setup from a non-relative
theorem to a relative theorem; in fact, this amounts to making certain
random variables “almost independent” instead of genuinely
independent. Because of this, we
can establish a relative Szemeredi-Furstenberg-Katznelson theorem on a
pseudorandom measure (and also reprove the relative Szemeredi theorem of
Green and myself). By invoking the
Goldston-Yildirim analysis from our previous paper (with some simplifications
introduced in my short story on
this topic), we can then obtain multidimensional constellations in the
primes (with one annoying caveat, which is too technical to describe
here). We also obtain a similar
result for the Gaussian primes (the primes in the Gaussian integers); this
time we do not have this technical caveat because the Gaussian almost
primes are more pseudorandom than the Cartesian product of the rational
almost primes. We also speculate on
whether there is some ergodic theoretic or measure theoretic
interpretation of these results; they seem to suggest that the presence of
the shift operator is in fact not necessary for recurrence theorems!
- Updated:
The short story “An
information-theoretic proof of Szemeredi’s regularity lemma”. We now give two information-theoretic proofs of the regularity lemma, one
based on an “energy increment argument” (and closer in spirit
to Szemeredi’s original proof) and one based on an “entropy
increment argument” which is broadly similar but fits much better
into the framework of Shannon’s
entropy inequalities. This second
proof served as a simplified model for the hypergraph version in the paper
above, so it may well be worth reading this shorter and simpler paper
first to motivate the more complicated paper above.
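The flavor of the Shannon entropy inequalities driving the entropy increment argument can be seen in miniature (this example is mine, not from either paper): subadditivity H(X,Y) ≤ H(X) + H(Y), with equality exactly when X and Y are independent, verified on a small made-up joint distribution.

```python
import math

# Minimal illustration of a Shannon entropy inequality: subadditivity
#   H(X, Y) <= H(X) + H(Y),
# whose gap is the mutual information between X and Y.  The joint
# distribution below is an arbitrary made-up correlated pair on {0,1}^2.

def entropy(dist):
    """Shannon entropy (in bits) of a distribution {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of X and Y.
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

h_joint = entropy(joint)
h_sum = entropy(px) + entropy(py)
print(h_joint, "<=", h_sum)   # strict here, since X and Y are correlated
```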