[NIFI-11466] Add a ModifyCompression processor - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.0.0-M1, 1.22.0
Component/s: Extensions
Labels: None

Description

To convert from one compression format to another, a user currently has to chain two CompressContent processors: one to decompress and a second to compress into the target format. Running two processors, with disk I/O for the intermediate FlowFiles and their underlying content claims, can be I/O intensive.

Instead, a new ModifyCompression processor is proposed that both decompresses the incoming FlowFile and compresses the outgoing FlowFile, using appropriate memory buffers for the decompression/recompression. Adding "no decompression" and "no compression" options for the respective properties would let this processor function as CompressContent does now, with the added ability to convert from one compression format (e.g., gzip) to another (e.g., snappy-hadoop). One use case where this would help is an I/O-bound flow that moves compressed data from a legacy source system into HDFS for faster (and larger-volume / distributed) processing of the data.
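The single-pass idea described above can be sketched in plain Java. This is an illustration only, not the code from the linked pull request: the class name RecompressSketch, its method names, and the 8 KB buffer size are all hypothetical, and gzip-to-deflate stands in for the gzip-to-snappy-hadoop example because both formats ship with the JDK. The decompressing stream feeds the compressing stream through one in-memory buffer, so the uncompressed data is never written out as an intermediate file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch; not the ModifyCompression implementation.
public class RecompressSketch {
    // Assumed buffer size, chosen only for illustration.
    private static final int BUFFER_SIZE = 8192;

    // Copies all bytes from in to out through a fixed-size buffer.
    public static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    // Decompresses gzip data from source and recompresses it as zlib/deflate
    // into target in one pass. Closing the streams also closes source and target.
    public static void gzipToDeflate(InputStream source, OutputStream target) throws IOException {
        try (InputStream decompressed = new GZIPInputStream(source);
             OutputStream recompressed = new DeflaterOutputStream(target)) {
            copy(decompressed, recompressed);
        }
    }

    // Helper for the demo: gzip-compresses a byte array.
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (OutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        }
        return bytes.toByteArray();
    }

    // Helper for the demo: inflates zlib/deflate data back to raw bytes.
    public static byte[] inflate(byte[] deflated) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(deflated))) {
            copy(in, bytes);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "legacy source data, legacy source data".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream converted = new ByteArrayOutputStream();
        gzipToDeflate(new ByteArrayInputStream(gzip(original)), converted);
        // Round-trip check: inflating the converted bytes should give back the input.
        System.out.println(Arrays.equals(original, inflate(converted.toByteArray())));
    }
}
```

A "no decompression" or "no compression" option would correspond to skipping one of the two wrapper streams and copying the raw stream directly.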

Issue Links

links to

GitHub Pull Request #7180

People

Assignee: Matt Burgess

Reporter: Matt Burgess

Votes: 0

Watchers: 2

Dates

Created: 17/Apr/23 19:22

Updated: 30/Apr/23 02:13

Resolved: 30/Apr/23 02:13

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 1h 20m