The fundamental problem is that Commons Compress does decompression via CompressorInputStream’s read() methods, which are a pull-model interface, while the LZMA SDK (in the public domain) does it with Decoder.code(), a push-model method that takes a compressed input stream and an output stream to decompress to, then reads, decompresses, and writes, only returning when the entire file is decompressed. There is no clean way to convert this to a pull-model CompressorInputStream: either you have to pull in one thread while pushing from another, or push everything into an in-memory byte array (which needs O(n) memory!) and then pull from that afterwards through a ByteArrayInputStream. Both are really ugly solutions: a thread per stream is heavy and creating new threads is not allowed in some environments (e.g. unsigned Applets and Java EE servers), while trying to allocate O(n) memory can throw an OutOfMemoryError that affects the entire JVM.
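To make the two workarounds concrete, here is a rough sketch of both. PushDecoder is a hypothetical stand-in I made up for a push-model decoder with the same shape as Decoder.code(); it is not the LZMA SDK's actual API, and the class and method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class PushToPull {
    /** Hypothetical push-model decoder: only returns once everything is decoded. */
    public interface PushDecoder {
        void code(InputStream compressed, OutputStream decompressed) throws IOException;
    }

    /** Workaround 1: decode everything up front into memory. Needs O(n) memory. */
    static InputStream buffered(PushDecoder decoder, InputStream compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        decoder.code(compressed, sink);
        return new ByteArrayInputStream(sink.toByteArray());
    }

    /** Workaround 2: push from a background thread into a pipe, pull from the other end. */
    static InputStream threaded(PushDecoder decoder, InputStream compressed) throws IOException {
        PipedInputStream pullSide = new PipedInputStream(8192);
        PipedOutputStream pushSide = new PipedOutputStream(pullSide);
        Thread worker = new Thread(() -> {
            // Closing the push side signals end-of-stream to the reader.
            try (OutputStream out = pushSide) {
                decoder.code(compressed, out);
            } catch (IOException e) {
                // A real wrapper would hand this exception to the reading thread;
                // that plumbing is left out of the sketch.
            }
        }, "push-decoder");
        worker.setDaemon(true);
        worker.start();
        return pullSide;
    }
}
```

The buffered variant holds the whole decompressed output in memory before the first read() returns, and the threaded variant costs one extra thread per open stream, which is exactly the trade-off described above.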
The Java LZMA attempts out there rate as follows:
Maurel’s patch here uses O(n) memory: it decompresses the entire stream in the constructor and stores the output in a ByteArrayInputStream, from which each read() then copies.
http://jponge.github.io/lzma-java/ is licensed ASLv2 and states how it solved the push/pull problem: “Although not a derivate work, the streaming api classes were inspired from the work of Christopher League. I reused his technique of fake streams and working threads to pass the data around between encoders/decoders and "normal" Java streams.” In other words, it pushes in one thread and pulls in another. Actual decompression in the other thread is still done with the LZMA SDK, which it just wraps into an InputStream subclass.
http://contrapunctus.net/league/haques/lzmajio/ was written by Christopher League. It’s under “LGPL or the Common Public License” and has the same story: push in one thread, pull in another. It’s also just a wrapper around the LZMA SDK.
http://tukaani.org/xz/java.html is in the public domain and is already used by Commons Compress to provide XZ compression support. It supports XZ and LZMA2 only and supports them well - proper pull-model InputStream with no O(n) memory or background threads. LZMA2 is a different file format from LZMA. But then again LZMA2 uses LZMA internally. I’ll have to investigate in detail.
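For contrast, this is roughly what a proper pull-model decoder looks like. It is only a toy: the format is a made-up run-length encoding of (count, value) byte pairs, not LZMA or LZMA2, and RunLengthInputStream is a name I invented for the illustration. The point is that the decoder keeps its state in fields between calls and does just enough work per read() to produce the next byte, so it needs neither O(n) memory nor a second thread.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Toy pull-model decoder: input is a sequence of (count, value) byte pairs. */
final class RunLengthInputStream extends InputStream {
    private final InputStream in;
    private int remaining; // bytes still to emit from the current run
    private int value;     // byte value of the current run

    RunLengthInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        // Decode only as much input as is needed to produce one output byte.
        while (remaining == 0) {
            int count = in.read();
            if (count == -1) {
                return -1; // clean end of stream between runs
            }
            value = in.read();
            if (value == -1) {
                throw new EOFException("truncated run");
            }
            remaining = count; // a zero-length run is skipped by the loop
        }
        remaining--;
        return value;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

Doing the same for LZMA would mean restructuring the decoder loop so it can stop and resume at any output position, which is much more work than for this toy format.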