Uploaded image for project: 'Apache NiFi'
  1. Apache NiFi
  2. NIFI-6964

Use compression level for xz-lzma2 format of the CompressContent processor

    XMLWordPrintableJSON

    Details

    • Flags:
      Patch

      Description

      The CompressContent processor does not use the Compression Level property of the processor except for when using the GZIP compression format. On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself.

      As a side note, the xz compression format supports, amazingly enough, 10 levels of compression from 0 to 9 – the same as GZIP. The only difference that I can tell is level 0 of xz is not the lack of compression, but the lightest compression possible (i.e. still some compression) – whereas GZIP compression level 0 means just container the content but do not compress.

      I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is NOT XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis.

      The attached patch will enhance the CompressContent.java source code enabling the compression level property to be used in both the GZIP and the XZ-LZMA2 formats.

      Please consider adding this patch to the baseline for this processor. I've tested it and the results are fantastic because I can crank down the compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally seeing a 66% improvement in elapsed time to process highly compressed content using XZ format with compression level of 0 versus the hard-coded level 6 of the baseline code.

       

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                alphasupremicus John Pierce
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - 4h
                  4h
                  Remaining:
                  Time Spent - 0.5h Remaining Estimate - 3.5h
                  3.5h
                  Logged:
                  Time Spent - 0.5h Remaining Estimate - 3.5h
                  0.5h