Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 1.7.1
Environment: macOS, Java 8
Description
The Avro stream that comes out of the ExecuteSQL processor is uncompressed. It can be rewritten in compressed form using a ConvertRecord processor with an AvroReader and an AvroRecordSetWriter, but that is a lot of extra I/O that could be avoided by compressing transparently at the moment the Avro data is created.
For implementation, it looks like ExecuteSQL builds a JdbcCommon.AvroConversionOptions object. That options object would need to gain a compression setting. Then, within JdbcCommon#convertToAvroStream, the dataFileWriter would get a codec set by setCodec (which has to happen before the writer is opened with create), with the codec having been created shortly before.
As an example of creating the codec, I looked at how the AvroRecordSetWriter does it: setCodec() is called after the codec has been created from the configured compression option by a factory method.
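The proposal above can be sketched outside NiFi with the plain Avro Java API. This is only an illustration, not the actual NiFi change: the class name CompressedAvroSketch, the Compression enum, and the helper methods codecFor, write, and codecOf are hypothetical stand-ins for the compression setting on the options object and for the body of convertToAvroStream. It assumes the Apache Avro library is on the classpath and uses only the deflate codec, which needs no extra dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class CompressedAvroSketch {

    // Hypothetical stand-in for a compression setting carried by the options object.
    enum Compression { NONE, DEFLATE }

    // Map the setting to an Avro CodecFactory (level 6 is an arbitrary choice here).
    static CodecFactory codecFor(Compression compression) {
        switch (compression) {
            case DEFLATE:
                return CodecFactory.deflateCodec(6);
            default:
                return CodecFactory.nullCodec();
        }
    }

    // Write `rows` simple records as an Avro container file held in memory.
    static byte[] write(Schema schema, int rows, Compression compression) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            // The codec has to be set before create() opens the file.
            writer.setCodec(codecFor(compression));
            writer.create(schema, out);
            for (int i = 0; i < rows; i++) {
                GenericRecord record = new GenericData.Record(schema);
                record.put("id", i);
                record.put("name", "row-" + i);
                writer.append(record);
            }
        }
        return out.toByteArray();
    }

    // Read back the codec name stored in the container file's metadata.
    static String codecOf(byte[] avro) throws IOException {
        try (DataFileStream<GenericRecord> in = new DataFileStream<>(
                 new ByteArrayInputStream(avro), new GenericDatumReader<GenericRecord>())) {
            return in.getMetaString("avro.codec");
        }
    }

    public static void main(String[] args) throws IOException {
        Schema schema = SchemaBuilder.record("Row").fields()
            .requiredInt("id")
            .requiredString("name")
            .endRecord();
        byte[] plain = write(schema, 1000, Compression.NONE);
        byte[] deflated = write(schema, 1000, Compression.DEFLATE);
        System.out.println(codecOf(plain) + ": " + plain.length + " bytes");
        System.out.println(codecOf(deflated) + ": " + deflated.length + " bytes");
    }
}
```

Because the codec is recorded in the file header, downstream Avro readers pick it up without any configuration, which is what makes compressing at creation time transparent to the rest of the flow.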