Sorted by irrelevance

August 21st, 2008

The ACM Digital Library should be useful. It is one of the largest single-site repositories of academic writing about computer science and holds almost every paper published within the last thirty years that bears an ACM copyright. Unfortunately, it does so behind an absurdly ridiculous interface.

Say you wanted to find Tom Knight’s classic 1986 paper “An architecture for mostly functional languages.” This paper appeared in an ACM conference (specifically, the ACM Conference on LISP and Functional Programming), so you’d be right to search the ACM DL for it. It would be reasonable to assume that searching for “knight mostly functional” would give you a good chance of finding the paper quickly.

If you did this, you’d be presented with a list of results “sorted by relevance.” The most “relevant” results, it appears, are a wide range of papers from the last ten years on speculative multithreading — from conferences, journals, and unrefereed newsletters — that cite Knight’s paper. The ACM’s search algorithm identifies Knight’s actual paper as the 46th most “relevant” search result for “knight mostly functional.” This would be risible if there were tens of thousands of search results for this string. Since there are only 48, it’s completely unconscionable.

Seek and ye shall find, I guess.