I have said before that I believe theory/CS is a numbers game. During a few discussions sub vino, I was challenged on this issue. Two main points seemed to emerge:
- What matters is your best result (L∞).
- STOC/FOCS get 10-15 obvious accepts, and a heavy tail of competent papers, among which it's hard to choose. If we care about making STOC/FOCS a quality measure, why not accept just those 10-15 papers?
I have seen enough examples arguing for pessimism, mostly while I was acting as a reviewer. But since I can't actually talk about those, let me give 2 personal examples.
Predecessor. In STOC'06, Mikkel and I had our predecessor lower bound. The main claim to fame of this paper is that it was the first lower bound to separate linear and polynomial space.
When it comes to such a claim, in my opinion, it doesn't actually matter if you remember anything from the data structures class you took in college. It doesn't matter if you know that the bound is showing optimality of a very influential data structure from FOCS'75. It doesn't matter if you know this was one of the most well-studied problems in data-structure lower bounds, and people were actually betting the previous lower bound was tight.
No. Somebody comes and tells you that we've been toiling at this lower bound game for more than 20 years, and we haven't yet managed to show that a data structure with space n10 is sometimes better than one with space O(n). Now, they tell you, such a bound is finally here, and a vast array of problems from Algorithms 101 have switched from unapproachable to hopefully solvable. That should be a worthy result.
How did it feel from the inside? This was a paper we had worked on for 2 years. Just the conceptual understanding that we needed to develop felt like one of the longest journeys I have ever embarked on. And in the end, it was actually a clean result, which indeed needed conceptual developments, but not so much technical pain! Mikkel, who never ceases to amaze me with his youthful optimism, was betting on a best paper award. I was of course at a university, and had a better estimator of the community hype factor. I knew that Irit's PCP was unbeatable for an award, but I was still convinced our paper would go down with a splash.
The actual outcome? Moderately positive reviews. Nobody took notice. And we weren't even invited to the special issue, which was a genuine shock for us.
Range Counting. In STOC'07, I had a lower bound for 2D range counting. After the predecessor paper, I quickly decided this was the next big goal in the field. Range queries are very, very well-studied, play a crucial role in our understanding of data structures and geometry, and are a major bridge to worlds outside theory, such as databases (see this irrefutable argument). Armed with some appreciation of the techniques in the field, this also felt like a make-or-break problem: we were either going to solve it, or get stuck again for 20 years.
Though I like to look at cell-probe lower bounds, range counting has a life of its own. This class of problems was studied from the dawn of computer science, way before our understanding of models crystallized. People have been proving many excellent lower bounds for range queries in the so-called semigroup model: each input point is associated with weights from a semigroup; the goal is to compute the sum in a "range" (rectangle, circle, wizard's hat etc), using only the semigroup addition. In this model, we obtain a good understanding of geometry: it is enough to characterize a partial sum of a subset of points by the extreme points in a few directions. Addition only allows us to "build up" subsets, so we need to understand composability of constant-dimensional shapes, dictated by such extreme points.
While the bounds for semigroups were making very nice progress, we remained clueless about even the slightly stronger group model, where subtractions are allowed. This is the gap between understanding low-dimensional geometry and understanding computation (a very high-dimensional, unstructured space, such as the n-dimensional space of linear combinations). Proving bounds in the group model became the first lower-bound question in the surveys of Pankaj Agarwal and Jeff Erickson. Thus, the group model seemed hard enough; never mind the cell-probe model, which is significantly stronger than a model with additions and subtractions.
About a year after predecessor (I had the final critical insight one week before STOC), I managed to show an optimal lower bound for 2D range counting in the cell-probe and (of course) the group model. Again, it does not seem to me like a paper where you need to be an expert to understand that something interesting is happening. Citing from the abstract:
Proving such bounds in the group model has been regarded as an important challenge at least since [Fredman, JACM 1982] and [Chazelle, FOCS 1986].It seems to me that whenever somebody claims to address a challenge from JACM'82 and FOCS'86, the only possible answers are:
- You are lying.
- The paper is wrong.
- It seemed like an interesting problem back then, but things have changed and we no longer care. (... and I don't see this applying to range queries.)
- Yes, this is a very good paper.
Finally, the paper was not invited to the special issue. By this time, it was not something that had never crossed my mind; it was just a disappointment.
Conclusion. You should work on major open problems. If you solve them, it makes for papers which are reject-proof (well, almost). But don't count on that "one big result" plan, as you may be in for a nasty surprise. As a community, we are reliable enough to get some kind of average right, but we're not doing so well on L∞.
So when you get your first STOC/FOCS paper, go open the champagne or light up the cigar. You are officially part of the game. Now, it's time to play.
Next round: November 19.