This is not a Lucene issue, but I want to open this so future google
diggers can more easily find it.
There's this nasty bug in Sun's JRE:
The gist seems to be, if you try to read a large (eg 200 MB) number of
bytes during a single RandomAccessFile.read call, you can incorrectly
hit OOM. Lucene does this, with norms, since we read in one byte per
doc per field with norms, as a contiguous array of length maxDoc().
The workaround was a custom patch to do large file reads as several