Paul’s Canonical Tester No Comments

05  10, 2008

Paul Valiant’s Canonical Tester is a one-stop-shop solution for testing any symmetric property of a distribution. If the Canonical Tester can’t do it, neither can you. The additional upshot is that the Canonical Tester is quite simple an algorithm, so proving lower bounds (impossibility results) usually takes short 5-line arguments explaining why the Canonical Tester can’t do it.

Testing asymmetric properties is now the next interesting question. Interesting it is indeed because most practical questions actually tend to involve asymmetric properties. A modified example of Paul’s talk goes like this. Imagine Amazon makes a change on their website, and they like to know whether more books compared to electronics were sold after the change. This is an asymmetric test.

Cryo-EM structure determination No Comments

05  07, 2008

The crazy title up there means “recovering a 3D model of a molecule, using multiple 2D images/projections”. This was a talk at MIT by Yoel Shkolnisky. The talk was really neat, but the bottom line is this. From a CS point of view he was solving the following question. We have n points that (we know) come from some finite metric. We are given the pairwise distances (+ noise) between points that are within each other’s vicinity (in that metric). The goal is to reconstruct the metric. I’ve phrased this problem a little more generally then the form in which it is addressed in Yoel’s work.

The cool thing is that Yoel et al. solve this problem using a very natural appeal to graph Laplacians. This is found in a technical report on his page. Amit Singer (a collaborator) additionally has a paper on using these techniques to recover the global metric of sensor networks, based on local distance information. This is found in the form of a paper on Amit’s webpage.

There might be an opportunity for a new paper lurking in there. In particular, it is interesting to ask under what noise conditions (and noise can be correlated in real-world examples) is the recovery good. Having heard the talk, I believe that the answer to this question will also depend on the type of metric being recovered, i.e. it may depend on its doubling dimension e.g.

Oh, I should also mention that Yoel seems to have some cool papers involving the discrete Radon transform. This is probably a good place to read about it, while avoiding the continuous lingo.

Feige’s conjecture for sums with unbounded variance 2

04  22, 2008

I’d like to spell out a conjecture of Feige (which is pretty elegant and simple to state) if for no better reason so that I don’t forget it:

Let X_1,\dots,X_n be independent random non-negative variables with \E [X_i]=1 (and arbitrary variance). Then \Pr[\sum X_i < n + 1] \ge 1/e.

The conjecture is a little more general than that, but I’ve purified it for simplicity and, in fact, the above statement is the crux of the conjecture. Feige already proved the conjecture when 1/e is replaced by 1/13 using a, well, horrendous case analysis. So the goal here is really to get the constant 1/e and hopefully with a more natural argument.

This conjecture appears in On the sum of nonnegative independent random variables with unbounded variance, Feige’03. I believe that solving implies a better approximation constant for a certain sub-modular optimization problem.

Positive semidefinite operators and norm polytopes 5

04  18, 2008

I’ve been obsessing over the following conjecture. And to be honest I haven’t been able to find any counterexamples, nor make any significant progress towards proving it. So, here goes:

If A,B are positive semidefinite and A\preccurlyeq B, then \|A\|_{1\rightarrow 1} \le \|B\|_{1\rightarrow 1}.

In fact, it seems that if this is true for the \|\cdot\|_{1\rightarrow 1}-norm, it probably is true for all norms of the form \|\cdot\|_{p\rightarrow p}. Any ideas or references would be greatly appreciated.

L1 estimation, prize $10 5

03  17, 2008

I will reward a $10 prize to the first person who (dis)proves the following simple conjecture:

Let \xi\in\mathds{R}^n be uniformly distributed on the \ell_1 ball B_{\ell_1}=\{y\in\mathds{R}^n:\|y\|_1=1\}. Then, for every x\in B_{\ell_1} the variance of \xi\cdot x is independent of x, i.e. \mathbf{V}[\xi\cdot x] is a universal constant.

If you are wondering why I am posting simple conjectures for prize money — it is because I would have hated to spend the time (which I am short on) to discover they are false, while I would become immediately excited about them (and their consequences) if they are true

I will post the first correct solution as soon as it arrives, so if you are not seeing a solution below, get cracking. As a bonus for resolving this problem in the positive, I will share with you why it is important. I do require a complete formal argument!

(Prize has been awarded, see below!)

In fact, a slight variation on Alex’ argument (below) gives:

No distribution on \xi has the property that \mathbf{V}[x\cdot \xi]<\infty depends only and non-trivially on \|x\|_1.

Here’s why. Write v_1=\mathbf{V}[\xi\cdot\chi_1], v_2=\mathbf{V}[\xi\cdot 1/2(\chi_1+\chi_2)] and v_3=\mathbf{V}[\xi\cdot 1/2(\chi_1-\chi_2)]. Expand v_2 and v_3 using Formula does not parse: \mathbf{V} y =\E y^2 – (\E y)^2 and combine to get \E\xi_1\xi_2=\E\xi_1\E\xi_2. Substitute the latter back into v_2 to obtain v_2 = 1/2\cdot v_1. The latter, combined with the requirement v_1=v_2, holds only when v_1=0 or v_1=\infty thus giving a contradiction.

Vershynin on the condition number of random matrices No Comments

02  08, 2008

Vershynin gave a talk at MIT today on a beautiful recent result. Recall that the condition number of a matrix A is \kappa := \|A\|/\|A^{-1}\|. Where \|A\| is the norm of A defined as \|A\| := \sup_{x\neq 0} \|Ax\|_2 / \|x\|_2. Generalizing a long line of prior work Vershynin and Rudelson show that

For a random matrix A\in\mathds{R}^{n\times n}, \|A\|\le\sqrt{n} and \|A^{-1}\|\le \sqrt{n} with high probability (approaching 1). Here, the matrix is random in the sense that its entries are i.i.d. random variables, where the only restriction (if I recall correctly) is that the distribution of the matrix entries has a 4-th moment.

The pivotal technical theorem that leads to this result is yet again a generalization of a line of prior results, which informally goes like this

Let a_1,\dots,a_n\in\mathds{R} be given, and let \sigma_1,\dots,\sigma_n\in\{\pm1\} be randomly chosen signs. If \Pr[\sum_i \sigma_i a_i = 0] > \epsilon, then a_1,\dots,a_n can be “embedded” (with small error) in an arithmetic progression of size at most O(\epsilon^{-1}).

This theorem is the key ingredient to proving anti-concentration results – the opposite of what we usually do in Probability Theory – you want to prove that measure is spread out, not focused.

The talk was a very nice walk through the connection between a few fields: geometry, functional analysis, probability and combinatorics. What was interesting to me was how crisp-and-clear the connection between continuous objects (i.e. random real matrices) and additive combinatorics (i.e. arithmetic progression business) becomes after listening to this result.

LP decoding and WP vs. RIP No Comments

01  26, 2008

Traditionally compressed sensing has been based on linear transformations with the Restricted Isometry Property (RIP). In a recent note Kashin and Temlyakov point out that a different matrix property, the width property (WP), also suffices for LP decoding. It is also known that RIP matrices have the WP (up to a small polylog difference in the isometry inequalities). However, it is unclear whether WP suffices for recovering in the presence of noise. The details of this topic are found in my notes for Piotr Indyk’s class on Streaming Algorithms.

SODA’08 highlights No Comments

01  17, 2008

First, let me say that I didn’t attend, but from reading others’ blogs, I gathered these must-reads:

  • Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes, Ailon, Liberty, SODA’08: I started reading this. It’s really nice. It improves Chazelle & Ailon’s FJLT by noticing that the worst-case is unlikely. Then it proceeds to use elegant techniques from Banach theory and coding
  • Clustering for Metric and Non-Metric distance measures, Ackermann, Blomer, Sohler, SODA’08: This was recommended by The Geomblog
  • Declaring Independence via Sketching of Sketches, Indyk, McGregor, SODA’08: This one I’ve actually read. This paper scratches the surface of a new subject of sensing statistics of matrix-like objects. The main open question (rephrased) of this paper is this:

    If a matrix A\in\mathds{R}^{n\times n} is streamed (as element-wise updates), can you approximate |A-(A\vi)(\vi^*A)|, where |M|:=\sum_{i,j}|m_{i,j}|?

    On a related note, I will mention another open question (pertaining to streaming algorithms on matrices) without motivating it much (feel free to ask me for details): Can one “take the square root of a desired unimodal origin-symmetric distribution W”. I.e. is there some magical distribution Q, so that if X~Q and Y~Q then X*Y~W? E.g. when W is taken to be some p-stable distribution, one can use its (convolutional) square root Q to approximate \ell_p-norms of the singular values vector of a streaming matrix in the rotated space. I am mentioning this question because no one seems to know the answer or have any intuition about it. Yet the question is natural and of a classical type (in Harmonic Analysis)

  • Finally, Mitzenmacher’s blog reports that Diaconis’ talk on the relationship between carry, shuffling and Young tableux is quite fascinating

At this point I open the stage for you guys to tell me about other great picks.