Commons Compress / COMPRESS-646

Improve performance of the Snappy Framed I/O streams

Details

    • Type: Wish
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.22
    • Fix Version/s: None
    • Component/s: Compressors
    • Labels: None
    • Environment: Java 11.0.2 (OpenJDK),
      tested on both Windows 10 and Linux (Ubuntu 20.04)

    Description

      Hello,

      I've been using the Snappy format to quickly compress and decompress JSON files. I chose the FramedSnappyCompressorOutputStream and FramedSnappyCompressorInputStream classes provided by Apache Commons Compress because my project already had several dependencies on that library.

      Although compression and decompression work correctly for every file, feedback about performance issues with large files started to emerge.

      When I tested them, the performance of these streams was very underwhelming.

      Decompressing a 90 MB json.sz file (1.5 GB as decompressed .json) took 2 minutes, which is far from the performance expected of a Snappy stream, a format which "[...] does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression.".

      Switching to the framed I/O streams from xerial/snappy-java reduced the compression and decompression times by roughly two orders of magnitude.

      Running the same code from the attached Tools.java through a Maven command took 1.5 s after switching the stream implementation to org.xerial.snappy.SnappyFramedInputStream, versus a consistent 125+ s with FramedSnappyCompressorInputStream.
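
      For readers who want to reproduce this kind of comparison, here is a minimal timing sketch. It is not the attached Tools.java; the class name DecompressTiming, the Decoder interface and the helper methods are hypothetical names chosen for illustration. To stay runnable with the JDK alone, it uses GZIP as a stand-in codec. With the respective libraries on the classpath, passing FramedSnappyCompressorInputStream::new or SnappyFramedInputStream::new as the decoder, with a framed Snappy payload instead of the GZIP one, should exercise the two implementations compared above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DecompressTiming {

    /** Hypothetical hook: wraps a compressed stream in a decompressing one. */
    @FunctionalInterface
    public interface Decoder {
        InputStream wrap(InputStream compressed) throws IOException;
    }

    /** Reads the stream to the end in 64 KiB chunks and returns the byte count. */
    public static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Decodes the payload once; returns {decodedBytes, elapsedNanos}. */
    public static long[] time(byte[] compressed, Decoder decoder) throws IOException {
        long start = System.nanoTime();
        long bytes;
        try (InputStream in = decoder.wrap(new ByteArrayInputStream(compressed))) {
            bytes = drain(in);
        }
        return new long[] {bytes, System.nanoTime() - start};
    }

    /** Builds a GZIP payload; a stand-in for a framed Snappy (.sz) file. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[8 * 1024 * 1024]; // 8 MiB of zeros as sample input
        byte[] compressed = gzip(raw);
        // Swap the decoder here to compare implementations on the same payload.
        long[] result = time(compressed, GZIPInputStream::new);
        System.out.printf("decoded %d bytes in %.1f ms%n", result[0], result[1] / 1e6);
    }
}
```

      A single run like this includes JIT warm-up, so it only gives a rough figure. That is enough to show a gap as large as the one reported here, but not to measure small differences.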

      Since it's not a bug, I'm not flagging this ticket as one, but it makes using Apache Commons Compress pointless for that format, and even counter-productive.

      Bringing performance up to par with other implementations, or deprecating the decompressor, would be greatly appreciated.

      I tried to upload the aforementioned file, but Jira refuses it because the direct upload limit is 60 MB. I should, however, be able to provide a file of around 40 MB if necessary.

      Best Regards,

      Mehdi Ennaïme

      Attachments

        1. Tools.java
          3 kB
          Mehdi Ennaime

        People

            Assignee: Unassigned
            Reporter: Mehdi Ennaime (mehdiennaime)