More On Lucene
According to Ted Leung (link courtesy of Jarno Virtanen), "Lupy is much slower than either Lucene or CLucene."
CLucene is a C++ port of Lucene. (I was thinking of Common Lisp when I mentioned CLucene.) CLucene claims to be "faster than lucene as it is written in C++."
These two articles by Otis Gospodnetic (linked from the Lucene site) introduce Lucene's API and index structures:
While I was reading the second article above, this page came to mind. To quote a passage: "Because we have about 2 gigs of static data we need rapid access to, we use C++ code to memory-map huge files containing pointerless C structs (of flights, fares, etc), and then access these from Common Lisp using foreign data accesses." I haven't a clue whether Lucene indices are usable in this manner.
Ah, Francesco Bellomi has tried just that, using Java's NIO (new I/O) API, which supports memory-mapped files.
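For the curious, here is a minimal sketch of what memory-mapping a file through NIO can look like. It is not Bellomi's code and it does not touch Lucene's Directory class; the MmapSketch class, the 12-byte record layout (an int id followed by a long value), and the helper names are all made up for illustration. It also assumes the file is smaller than 2 GB, since a single mapped buffer cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapSketch {
    // Hypothetical fixed-width record: int id (4 bytes) + long value (8 bytes).
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES;

    // Map the whole file read-only. The OS pages the data in on demand,
    // so nothing is copied onto the Java heap up front.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Absolute reads: the record is located by offset arithmetic alone,
    // much like indexing into an array of pointerless structs.
    static int idAt(ByteBuffer buf, int index) {
        return buf.getInt(index * RECORD_SIZE);
    }

    static long valueAt(ByteBuffer buf, int index) {
        return buf.getLong(index * RECORD_SIZE + Integer.BYTES);
    }

    public static void main(String[] args) throws IOException {
        // Write three sample records to a temporary file.
        Path file = Files.createTempFile("records", ".bin");
        file.toFile().deleteOnExit();
        ByteBuffer out = ByteBuffer.allocate(3 * RECORD_SIZE);
        for (int i = 0; i < 3; i++) {
            out.putInt(i).putLong(100L * i);
        }
        Files.write(file, out.array());

        // Read one record back through the mapping.
        MappedByteBuffer mapped = mapReadOnly(file);
        System.out.println(idAt(mapped, 2) + " -> " + valueAt(mapped, 2));
    }
}
```

The point of the fixed-width layout is that any record can be reached without parsing the ones before it, which is what makes the "pointerless C structs" approach quoted above work. Whether Lucene's own index files lend themselves to this kind of access is a separate question.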
BTW, here's the cue for James Robertson's rant on "final": "Some [methods] were final in Directory, so I have used a slightly modified version of Directory.java (BTW, I wonder why so many methods in Lucene are made final...)"