Apache NiFi / NIFI-5591

Enable compression of Avro in ExecuteSQL

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.1
    • Fix Version/s: 1.8.0
    • Component/s: Extensions
    • Environment:
      macOS, Java 8

      Description

      The Avro stream that comes out of the ExecuteSQL processor is uncompressed. It's possible to rewrite it compressed using a ConvertRecord processor with an AvroReader and an AvroRecordSetWriter, but that's a lot of extra I/O that could be handled transparently at the moment the Avro data is created.

      For implementation, it looks like ExecuteSQL builds a JdbcCommon.AvroConversionOptions object. That options object would need to gain a compression flag. Then, within JdbcCommon#convertToAvroStream, the dataFileWriter would get its codec via setCodec, with the codec created shortly before that call.

      For an example of creating the codec, I looked at how AvroRecordSetWriter does it: setCodec() is called after the codec has been created from a configured option by a factory method.
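
      The proposal above could be sketched roughly as follows. This is a minimal illustration, not NiFi's actual code: the class AvroOptionsSketch, the Compression enum, and the builder methods are hypothetical names, and the set of compression choices is an assumption. The codec name strings ("null", "deflate", "bzip2", "snappy") are the names Avro itself uses, and the commented lines show where Avro's CodecFactory.fromString and DataFileWriter#setCodec would plausibly fit.

```java
import java.util.Locale;

/** Illustrative stand-in for the options object; not NiFi's actual class. */
final class AvroOptionsSketch {

    /** Compression choices, each paired with the codec name Avro uses for it. */
    enum Compression {
        NONE("null"), DEFLATE("deflate"), BZIP2("bzip2"), SNAPPY("snappy");

        private final String codecName;

        Compression(String codecName) {
            this.codecName = codecName;
        }

        String codecName() {
            return codecName;
        }

        /** Factory method: maps a property value to a choice; blank means NONE. */
        static Compression fromProperty(String value) {
            if (value == null || value.trim().isEmpty()) {
                return NONE;
            }
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    /** Immutable options carrying the proposed compression flag. */
    static final class Options {
        private final boolean convertNames;
        private final Compression compression;

        private Options(Builder b) {
            this.convertNames = b.convertNames;
            this.compression = b.compression;
        }

        boolean convertNames() { return convertNames; }
        Compression compression() { return compression; }

        static Builder builder() { return new Builder(); }
    }

    static final class Builder {
        private boolean convertNames = false;
        private Compression compression = Compression.NONE; // uncompressed unless requested

        Builder convertNames(boolean v) { this.convertNames = v; return this; }
        Builder compression(Compression c) { this.compression = c; return this; }
        Options build() { return new Options(this); }
    }

    public static void main(String[] args) {
        Options options = Options.builder()
                .compression(Compression.fromProperty("deflate"))
                .build();
        // With Avro on the classpath, the writer side would then look roughly like:
        //   dataFileWriter.setCodec(CodecFactory.fromString(options.compression().codecName()));
        //   dataFileWriter.create(schema, outputStream); // setCodec must precede create
        System.out.println(options.compression().codecName());
    }
}
```

      Defaulting the flag to NONE would keep existing flows producing uncompressed Avro unless compression is explicitly requested.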

              People

              • Assignee:
                pvillard Pierre Villard
                Reporter:
                colindean Colin Dean
              • Votes:
                0
                Watchers:
                3
