  1. Apache NiFi
  2. NIFI-5591

Enable compression of Avro in ExecuteSQL


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1
    • Fix Version/s: 1.8.0
    • Component/s: Extensions
    • Environment: macOS, Java 8

    Description

      The Avro stream that comes out of the ExecuteSQL processor is uncompressed. It's possible to rewrite it compressed using a ConvertRecord processor with an AvroReader and an AvroRecordSetWriter, but that's a lot of extra I/O that could be handled transparently at the moment the Avro data is created.

      For implementation, it looks like ExecuteSQL builds a set of JdbcCommon.AvroConvertionOptions here. That options object would need to gain a compression flag. Then, within JdbcCommon#convertToAvroStream here, the dataFileWriter would have its codec set via setCodec, with the codec created shortly beforehand.

      For an example of creating the codec, I looked at how the AvroRecordSetWriter does it. The setCodec() call is made here, after the codec is created from a factory option here using a factory method here.
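The flow proposed above (an options object that carries a compression setting, a factory method that resolves it, and a writer that applies it before any data is written) can be sketched roughly as follows. This is a JDK-only illustration so that it runs without Avro on the classpath. Every name in it (CompressionSketch, Codec, Options, codecFor, write, writtenSize) is hypothetical and not taken from NiFi, and DeflaterOutputStream merely stands in for Avro's deflate codec. In an actual change the corresponding calls would presumably be Avro's CodecFactory factories and DataFileWriter#setCodec, which has to be invoked before the writer is opened with create().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.zip.DeflaterOutputStream;

/** JDK-only stand-in for the proposed flow; none of these names come from NiFi. */
final class CompressionSketch {

    /** Hypothetical codec choices; the matching Avro factory is noted per constant. */
    enum Codec {
        NONE,    // Avro: CodecFactory.nullCodec()
        DEFLATE  // Avro: CodecFactory.deflateCodec(level)
    }

    /** Hypothetical options object that has gained a compression setting. */
    static final class Options {
        final Codec codec;

        private Options(Codec codec) {
            this.codec = codec;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            private Codec codec = Codec.NONE; // default keeps the output uncompressed

            Builder codec(Codec codec) {
                this.codec = codec;
                return this;
            }

            Options build() {
                return new Options(codec);
            }
        }
    }

    /** Factory method: resolves a property value such as "deflate" to a codec. */
    static Codec codecFor(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return Codec.NONE;
        }
        // Throws IllegalArgumentException for names that are not declared in Codec.
        return Codec.valueOf(propertyValue.trim().toUpperCase(Locale.ROOT));
    }

    /** Stand-in for the conversion step: the codec is chosen before anything is written. */
    static void write(byte[] records, OutputStream out, Options options) throws IOException {
        // With Avro, this is the point where dataFileWriter.setCodec(...) would be called.
        if (options.codec == Codec.DEFLATE) {
            // Closing the deflater stream also closes the underlying stream.
            try (DeflaterOutputStream deflated = new DeflaterOutputStream(out)) {
                deflated.write(records);
            }
        } else {
            out.write(records);
        }
    }

    /** Convenience helper: writes to memory and returns the number of bytes produced. */
    static int writtenSize(byte[] records, Options options) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            write(records, out, options);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        byte[] records = new byte[4096]; // all zeros, so highly compressible
        Options plain = Options.builder().build();
        Options deflated = Options.builder().codec(codecFor("deflate")).build();
        System.out.println("uncompressed bytes: " + writtenSize(records, plain));
        System.out.println("deflated bytes:     " + writtenSize(records, deflated));
    }
}
```

Defaulting the builder to NONE means existing flows would see no change in output unless compression is explicitly requested, which seems like the least surprising behavior for an existing processor.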


          People

            Assignee: Pierre Villard (pvillard)
            Reporter: Colin Dean (colindean)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:
