Before it does anything else, ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream) loops over all futures returned by the creator's executor service and calls Future#get() on each of them. Each call blocks until that future's computation has completed, i.e., the loop finishes only once all entries have been written to the thread-local scatter streams.
However, if the computation of a future fails, Future#get() throws an exception instead. This exception escapes ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream) before the executor service is shut down. As a result, the thread-local variables in the executor service's threads, and all objects referenced by them, continue to exist and cannot be reclaimed by the GC.
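The ordering problem described above can be reproduced with plain java.util.concurrent, without Commons Compress. The following is a minimal sketch, not the library's actual code: the class ShutdownOrderSketch and all of its method names are hypothetical and only mimic the "wait for all futures, then shut down" sequence, contrasted with a variant that shuts down in a finally block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the sequence described in this report.
final class ShutdownOrderSketch {

    // Waits for all futures first and shuts down afterwards.
    // If any get() throws, the shutdown call is never reached.
    static void awaitThenShutdown(ExecutorService es, List<Future<Void>> futures)
            throws InterruptedException, ExecutionException {
        for (Future<Void> future : futures) {
            future.get();
        }
        es.shutdown();
    }

    // Same loop, but the shutdown runs on every exit path.
    static void awaitAndAlwaysShutdown(ExecutorService es, List<Future<Void>> futures)
            throws InterruptedException, ExecutionException {
        try {
            for (Future<Void> future : futures) {
                future.get();
            }
        } finally {
            es.shutdown();
        }
    }

    // Submits one failing task and reports whether the executor was shut down
    // by the time the exception reached the caller.
    static boolean isShutDownAfterFailure(boolean useFinally) {
        ExecutorService es = Executors.newFixedThreadPool(2);
        Callable<Void> failingTask = () -> {
            throw new IllegalStateException("simulated entry failure");
        };
        List<Future<Void>> futures = new ArrayList<>();
        futures.add(es.submit(failingTask));
        try {
            if (useFinally) {
                awaitAndAlwaysShutdown(es, futures);
            } else {
                awaitThenShutdown(es, futures);
            }
        } catch (ExecutionException e) {
            // Expected: the simulated failure surfaces here in both variants.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean shutDown = es.isShutdown();
        es.shutdownNow(); // keep the sketch itself from leaking worker threads
        return shutDown;
    }

    public static void main(String[] args) {
        System.out.println("shutdown after loop: " + isShutDownAfterFailure(false));
        System.out.println("shutdown in finally: " + isShutDownAfterFailure(true));
    }
}
```

In the first variant the worker threads stay alive after the failure, together with anything their thread-locals reference; in the second, the pool is told to shut down before the exception propagates.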
I encountered this situation when the JVM threw an OutOfMemoryError while processing an archive with 130,000 documents. The application was not able to recover from this OOM error because most of the heap was occupied by objects reachable from the executor service's threads.
Of course, the OOM is mostly the fault of my own code; I will be able to work around the "leaked" executor service because I supply it in the first place and can therefore shut it down if I detect an error situation.
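For illustration, the workaround on the caller's side might look roughly like the sketch below. It assumes the caller owns the ExecutorService it handed to the creator; ExecutorGuard and callOrShutdown are hypothetical names, and the Callable stands in for whatever code ends up calling ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical caller-side helper; not part of Commons Compress.
final class ExecutorGuard {

    // Runs the given work. If it fails for any reason, the caller-supplied
    // executor is shut down before the failure is rethrown, so its threads
    // (and their thread-locals) can be released.
    static <T> T callOrShutdown(ExecutorService es, Callable<T> work) throws Exception {
        try {
            return work.call();
        } catch (Throwable t) {
            es.shutdownNow(); // interrupts running workers and discards queued tasks
            throw t;
        }
    }

    public static void main(String[] args) {
        ExecutorService es = Executors.newFixedThreadPool(2);
        try {
            // In the real scenario, the work would call creator.writeTo(zipOut).
            callOrShutdown(es, () -> {
                throw new IllegalStateException("simulated failure while writing");
            });
        } catch (Exception e) {
            System.out.println("executor shut down: " + es.isShutdown());
        }
    }
}
```

On success the helper leaves the executor untouched, so the normal shutdown path is unaffected. Whether shutdownNow() or shutdown() is the better choice after a failure depends on whether the remaining tasks should still be allowed to finish.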
The effect would be the same, though, if, say, Future#get() threw an InterruptedException. Therefore, if ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream) cannot complete its task because a future threw an exception, it should either shut down the executor service and release all resources, or offer a reasonable recovery strategy.
- relates to
COMPRESS-470 ParallelScatterZipCreator leaks temporary files (and maybe more)