LP bounds for error-correcting codes No Comments

09  30, 2008

Alex Samorodnitsky talked about LP bounds for coding at MIT. A few small things I wanted to jot down follow. Delsarte was the first one to attempt an LP-type bound for coding. It sounded that his work would be interesting to read as establishing the actual bound went through some algebra and some geometry. In 1989 Kalai-Linial publish a Fourier-based approach to the analysis of Delsarte’s LP. Alex suggested that this may be the first time Fourier is used in Computer Science. The LP bound, which is essentially a packing bound for the Hamming cube, has counterparts in Mathematics regarding packings in metric spaces. Alex’s latest paper (S’08) has references to all these, but it is not available yet. So a related link is a previous paper of his and Navon’s.

Learning sensing bases and photography 2

05  27, 2008

Guillermo Sapiro has done some quite fascinating work on applications of compressed sensing to photography (and one might say to practice in general) which he spoke about at MIT last week. The tagline is this: In compressed sensing one proves that using a random basis (i.e. random sensing matrix) one can recover signals (after the measurement) as long as one knows any basis where the signal is sparse. In other words, random matrices have this universality property. In practice, however, if you sense with a random matrix the recovery quality of, say, images is somewhat horrible — images come out as if you added Gaussian noise to them. Sapiro’s work addresses this problem.

The key idea in his work is to learn a basis that is good for images. Such basis will have to give up the universality property, but this is OK. Learning such a basis is done by taking a big collection of images and trying to find a basis where all images have sparse representations that are not too far (in L2 sense) from the original images. It turns out that a relatively simple iterative algorithm, involving linear regression at each step, does the job just fine. The results are stunning! Images come out with practically perfect quality. 

··· Read more (and comment)

Paul’s Canonical Tester No Comments

05  10, 2008

Paul Valiant’s Canonical Tester is a one-stop-shop solution for testing any symmetric property of a distribution. If the Canonical Tester can’t do it, neither can you. The additional upshot is that the Canonical Tester is quite simple an algorithm, so proving lower bounds (impossibility results) usually takes short 5-line arguments explaining why the Canonical Tester can’t do it.

Testing asymmetric properties is now the next interesting question. Interesting it is indeed because most practical questions actually tend to involve asymmetric properties. A modified example of Paul’s talk goes like this. Imagine Amazon makes a change on their website, and they like to know whether more books compared to electronics were sold after the change. This is an asymmetric test.

Population algorithms No Comments

05  09, 2008

I was very pleasantly surprised today to discover that what I call (the class of) Population Algorithms has recently (this past year) been considered (not under this name) by Khandekar and Awerbuch. I am delighted to put links to their relevant papers:

Cryo-EM structure determination No Comments

05  07, 2008

The crazy title up there means “recovering a 3D model of a molecule, using multiple 2D images/projections”. This was a talk at MIT by Yoel Shkolnisky. The talk was really neat, but the bottom line is this. From a CS point of view he was solving the following question. We have n points that (we know) come from some finite metric. We are given the pairwise distances (+ noise) between points that are within each other’s vicinity (in that metric). The goal is to reconstruct the metric. I’ve phrased this problem a little more generally then the form in which it is addressed in Yoel’s work.

The cool thing is that Yoel et al. solve this problem using a very natural appeal to graph Laplacians. This is found in a technical report on his page. Amit Singer (a collaborator) additionally has a paper on using these techniques to recover the global metric of sensor networks, based on local distance information. This is found in the form of a paper on Amit’s webpage.

There might be an opportunity for a new paper lurking in there. In particular, it is interesting to ask under what noise conditions (and noise can be correlated in real-world examples) is the recovery good. Having heard the talk, I believe that the answer to this question will also depend on the type of metric being recovered, i.e. it may depend on its doubling dimension e.g.

Oh, I should also mention that Yoel seems to have some cool papers involving the discrete Radon transform. This is probably a good place to read about it, while avoiding the continuous lingo.

L1 bounds for Laplacians No Comments

05  02, 2008

Spectral Graph theory has various methods of bounding the Laplacian eigenvalues of a graph G (say by embedding another graph H into G and computing the congestion). Usually we look at the ordered eigenvalues 0=\lambda_1<\lambda_2\le \cdots \le \lambda_n, and usually we simplify things a little by assuming a bounded-degree graph, which gives us \lambda_n\le D_{\max}=O(1) for free, where D_{\max} is the maximum vertex degree. So really, it is mostly just interesting to bound \lambda_2 from below. When this is done, it is very often used as \|L^{-1}x\|_2 \le \lambda_2^{-1} \|x\|_2 (when bounding energy of electrical flows in G) or as \|(I-L)^kx\|_2\le (1-\lambda_2)^k \|x\|_2 (when bounding mixing times of random walks on G).

We are interested in a new type of Laplacian bounds. It turns out that L1 bounds of the sort \min_{x\bot\vi,\|x\|_1=1}\|Lx\|_1\ge \eta relate directly to how good electrical flow is when used for oblivious routing on G. Note that it is required here that x\bot\vi since \vi is the kernel of the Laplacian of any connected graph. This sort of bound is atypical in two ways. First, it is an L1 bound, and second, it is a lower-bound.

It is trivial to see that \|L_{K_n}x\|_1=n where L_{K_n} is the Laplacian of the n-clique. What we are after is proving that

\|L_H x\|_1\ge (\log n)^{-1} where H is an expander and \|x\|_1=1 and x\bot\vi.

This statement holds true in experiments with random bounded-degree expanders, and in particular it has natural interpretations in terms of electrical flow (which I am going to omit here). One can even give a pseudo-proof of this fact by simply observing that L_H \approx  L_{K_n} / n\log n.

I am curious if someone has come across similar types of inequalities and/or has an interesting proof idea.

Note that the above conjecture was recently proven and its proof will soon appear in a publication. I will post a link to it here when this happens. At this stage will be happy to hear of applications of this (now) theorem if you come across any.

Feige’s conjecture for sums with unbounded variance 2

04  22, 2008

I’d like to spell out a conjecture of Feige (which is pretty elegant and simple to state) if for no better reason so that I don’t forget it:

Let X_1,\dots,X_n be independent random non-negative variables with \E [X_i]=1 (and arbitrary variance). Then \Pr[\sum X_i < n + 1] \ge 1/e.

The conjecture is a little more general than that, but I’ve purified it for simplicity and, in fact, the above statement is the crux of the conjecture. Feige already proved the conjecture when 1/e is replaced by 1/13 using a, well, horrendous case analysis. So the goal here is really to get the constant 1/e and hopefully with a more natural argument.

This conjecture appears in On the sum of nonnegative independent random variables with unbounded variance, Feige’03. I believe that solving implies a better approximation constant for a certain sub-modular optimization problem.

Resistance vs. density No Comments

04  07, 2008

In a recent paper “Graph sparsification by effective resistances”, Spielman, Srivastava (STOC’08) show that one can sparsify a graph by essentially randomly getting rid of edges with very low effective resistance. Here we discuss an interesting corollary of one of the lemmas in this paper, which is not explicitly mentioned in the paper. In particular, we show a direct connection between the edge density of a graph and the smallest effective resistance.

For a graph G define \Pi=W^{1/2}BL^{-1}B^* W^{1/2}\in\mathds{R}^{E\times E}. This is all standard notation: W is the diagonal edge-weights matrix, B\in\mathds{R}^{E\times V} is the edge orientation matrix, and L is the Laplacian. (Note we have that L=B^*WB.)

The matrix \Pi has the property that \Pi_{e,e}=w_eR_e, where R_e is the effective resistance of edge e, and w_e is its weight. We’ll focus on the case when all w_e=1 for simplicity. We prove the following statement:

 

If graph G has n vertices and m edges, then there is an edge of effective resistance at most (n-1)/m
 

To see this, we use Lemma 3 in the paper which states that \Pi has eigenvalue 1 of multiplicity n-1, and eigenvalue 0 of multiplicity m-n+1. The sum of the eigenvalues thus equals n-1. On the other hand, the trace of \Pi equals \sum_e w_eR_e. Equating the two (and using w_e=1) we get \sum_e R_e = n-1. Therefore \E\cdot R_e = (n-1)/m. Thus, there must exist at least one edge with effective resistance at most (n-1)/m.

This inequality is tight in various small-graph cases that one could think of. And it is asymptotically tight in others (a long cycle with a shortcut in the middle). It is interesting to find out whether it is tight for large graphs of intermediate density (more than linear, less than quadratic).

Let’s see how this equality relates (directly) to sparsification. A close inspection of Lemma 4 in the paper asserts the following statement:

If we remove an edge e from a graph, then the Laplacian of the new and the old graph are related by (1-w_e R_e)L\preccurlyeq\tilde{L}\preccurlyeq L. This inequality is tight on both sides.

On the left equality is attained when considering the vector of potentials corresponding to a unit electric flow between the endpoints of e. On the right equality is attained by considering any vector that equals 0 over the endpoints of e.

The last statement combined with the inequality (that relates effective resistance to density) beg the question whether one can sparsify a graph by removing one edge at a time (and re-weighting the rest appropriately). If this is indeed a possibility, it would also have to be true that any sequence of removed edges e_1,\dots,e_k that leads to a best sparsification must contain the edge of smallest effective resistance in the original graph.

The questions mentioned here are important since they relate to the conjecture that all graphs have linear-size sparsifiers. It is also a possibility that a one-edge-at-a-time procedure may lead to a more practical sparsification (in the Laplacian sense) algorithm than the currently available alternatives.

L1 estimation, prize $10 5

03  17, 2008

I will reward a $10 prize to the first person who (dis)proves the following simple conjecture:

Let \xi\in\mathds{R}^n be uniformly distributed on the \ell_1 ball B_{\ell_1}=\{y\in\mathds{R}^n:\|y\|_1=1\}. Then, for every x\in B_{\ell_1} the variance of \xi\cdot x is independent of x, i.e. \mathbf{V}[\xi\cdot x] is a universal constant.

If you are wondering why I am posting simple conjectures for prize money — it is because I would have hated to spend the time (which I am short on) to discover they are false, while I would become immediately excited about them (and their consequences) if they are true

I will post the first correct solution as soon as it arrives, so if you are not seeing a solution below, get cracking. As a bonus for resolving this problem in the positive, I will share with you why it is important. I do require a complete formal argument!

(Prize has been awarded, see below!)

In fact, a slight variation on Alex’ argument (below) gives:

No distribution on \xi has the property that \mathbf{V}[x\cdot \xi]<\infty depends only and non-trivially on \|x\|_1.

Here’s why. Write v_1=\mathbf{V}[\xi\cdot\chi_1], v_2=\mathbf{V}[\xi\cdot 1/2(\chi_1+\chi_2)] and v_3=\mathbf{V}[\xi\cdot 1/2(\chi_1-\chi_2)]. Expand v_2 and v_3 using Formula does not parse: \mathbf{V} y =\E y^2 – (\E y)^2 and combine to get \E\xi_1\xi_2=\E\xi_1\E\xi_2. Substitute the latter back into v_2 to obtain v_2 = 1/2\cdot v_1. The latter, combined with the requirement v_1=v_2, holds only when v_1=0 or v_1=\infty thus giving a contradiction.

XTC: Power of expression without non-determinism No Comments

01  31, 2008

Applied programmers tend to dread the prospect of having to write parsers because the process is usually laborious, error-prone and unclean — all due to the computational complexities of parsing Context Free Grammars (CFG’s) in practice. A great new alternative has emerged in recent years, which makes design and implementation of complex expressive languages a pleasant exercise accessible to most.

Why are CFG’s a mess in practice?

The most expressive, widely-used and familiar notion of a language formalism are CFG’s. They are easy to think of (in theory), since the following example rule essentially summarizes them completely:

A → AxByC | Dz | xy

Lower- and upper-case letters denote terminals and non-terminals, respectively, while the “|” symbol denotes non-deterministic choice. Nice and simple. However, parsing CFGs is computationally expensive (quadratic time), which has lead to two (complicating) trade-offs in practice:

··· Read more (and comment)