Thanks Jason Lowe. Those are excellent points, I completely agree that the patch introduced subtle differences if some of the streams throw exceptions upon close(). My previous reasoning was that in this case something's probably gone horribly and irrevocably wrong anyway. But following your comments, I prepared a more defensive patch, in which even if some of the close() or finish() methods throw exceptions we still try to close/recover what we can. The price of this is assuming that it's okay to call the close() method of a stream multiple times.
Other maintainers will come along and see that BZip2CompressionOutputStream never calls close() on one of its private member streams which is usually a bug. Even if the close() ends up being redundant today that doesn't mean it always will. The root cause for this JIRA is a great example.
In patch 002 I put back the BZip2CompressionOutputStream.close() method with a call to super.close() and some explanatory documentation. It still seems to me that calling super.close() should be sufficient, let me try to explain why.
I'm not seeing how the superclass's finish() method ever ends up closing the out stream. I see it write some final bytes and sets it to null, which in turn prevents the close() method from trying to call out.close(), so I'm wondering how the output stream normally gets closed.
My understanding is that the output stream does get closed, thanks to the out.close() call in CompressionOutputStream.close(). The out data member of CBZip2OutputStream is indeed nullified, but the out data member of CompressionOutputStream should still reference the actual stream object.
My reasoning was: the only difference between BZip2CompressionOutputStream.finish() and close() is that BZip2CompressionOutputStream.finish() calls output.finish(), whereas BZip2CompressionOutputStream.close() calls output.flush() and output.close(). Changing BZip2CompressionOutputStream.close() to super.close() will mean that we invoke finish() only instead of flush() and close(). Looking at CBZip2OutputStream (which can be the only class of the output data member in the current implementation), it seems to me that it's okay to invoke finish() instead of flush() + close(), because the only difference between them is calling out.flush() + out.close(). As I said above, out.close() will be called anyway by CompressionOutputStream.close(), and I'm assuming that any reasonable stream calls flush() internally on close().
In BZip2CompressionInputStream, patch 002 puts the call to super.close() in a finally block. This preserves the previous logic (set needsReset to true only if input.close() didn't throw) while ensuring that super.close() will unconditionally close the in stream and return the trackedDecompressor to the pool.
This is subtly different that the previous code because finish() can throw. In the old code, finish() could throw and out.close() would still be called but now we'll skip calling out.close() but still set closed=true so we can't retry the close. (...) Similarly the CompressionInputStream/CompressionOutputStream code won't return the codec to the pool if finish() throws or the underlying stream's close() throws.
In patch 002 I wrapped each of CompressionInput/OutputStream.close()'s internal steps in try/finally. (For CompressionOutputStream.close() this leaves the corner case of both finish() and out.close() throwing an exception each, but I think it's reasonable that only one of them will be propagated since it's a doomed stream anyway.) This brings the behavior of the patched CompressorStream.close() to what it was before my changes: if finish() throws, out.close() is still called.