Apache NiFi / NIFI-4675

PublishKafka_0_10 can't use demarcator and kafka key at the same time

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.5.0
    • Component/s: Core Framework

    Description

      At the moment you cannot both split a flowfile using a demarcator and set the Kafka key (the kafka.key attribute) for all resulting Kafka records. The code explicitly prevents this combination.

      Still, being able to use both at the same time would be a valuable performance boost wherever one flowfile contains many individual Kafka records. Flowfiles would no longer have to be pre-split (which multiplies NiFi overhead) just to set the key.

      Note:
      Using a demarcator and a Kafka key at the same time will normally give every Kafka record produced from one incoming flowfile the same Kafka key (see REMARK).
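
The behaviour described in this note can be illustrated with a minimal sketch. It is not the actual PublishKafka_0_10 code: the class `DemarcatedKeySketch`, the type `KeyedRecord`, and the method `split` are hypothetical names, records are modelled as plain strings rather than Kafka producer records, and skipping empty tokens is an assumption of the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DemarcatedKeySketch {

    /** Hypothetical stand-in for a Kafka record: one key, one value. */
    public static final class KeyedRecord {
        public final String key;
        public final String value;

        KeyedRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Splits one flowfile's content on the demarcator and attaches the same
     * key (the flowfile's kafka.key attribute) to every resulting record.
     * Empty tokens are skipped; that is an assumption of this sketch.
     */
    public static List<KeyedRecord> split(byte[] content, String demarcator, String kafkaKey) {
        if (demarcator == null || demarcator.isEmpty()) {
            throw new IllegalArgumentException("demarcator must not be empty");
        }
        String text = new String(content, StandardCharsets.UTF_8);
        List<KeyedRecord> records = new ArrayList<>();
        int start = 0;
        while (start <= text.length()) {
            int end = text.indexOf(demarcator, start);
            if (end < 0) {
                end = text.length(); // last token runs to the end of the content
            }
            String token = text.substring(start, end);
            if (!token.isEmpty()) {
                records.add(new KeyedRecord(kafkaKey, token));
            }
            start = end + demarcator.length();
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] flowFileContent = "event-1\nevent-2\nevent-3".getBytes(StandardCharsets.UTF_8);
        for (KeyedRecord r : split(flowFileContent, "\n", "device-42")) {
            System.out.println(r.key + " -> " + r.value);
        }
    }
}
```

One incoming flowfile yields several records that all carry the flowfile's key, so no pre-splitting into one flowfile per record is needed to set the key.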

      I know of a live NiFi deployment where this fix/feature (provided as a custom fix) led to a 500-600% increase in throughput. Others could and should benefit as well.

      REMARK
      The argument against this feature has been that it is not a good idea to intentionally generate many duplicate Kafka keys. I would argue that this is up to the user to decide. Most would use Kafka as a pure distributed log system, where key uniqueness is not important. The Kafka key can be a really valuable grouping placeholder, though. The only case where this becomes problematic is compaction of Kafka topics, when records are deduplicated by key. But once we put sufficient warnings and disclaimers about this risk in the tooltips, it is up to the user to decide whether to use the performance boost.
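
The compaction risk can be sketched with a deliberately simplified model in which a compacted topic retains only the latest value per key. Real Kafka compaction runs in the background and does not touch the active segment, so this is an illustration of the end state only; `CompactionSketch` and `compact` are hypothetical names, and each record is modelled as a two-element `{key, value}` array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {

    /**
     * Simplified model of log compaction: for each key, only the most
     * recently written value survives. Each record is a {key, value} pair.
     */
    public static List<String> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            latest.remove(record[0]);            // re-insert so order follows the last write
            latest.put(record[0], record[1]);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        // Three records from one flowfile share a key; a fourth has its own key.
        List<String[]> log = Arrays.asList(
                new String[] {"device-42", "event-1"},
                new String[] {"device-42", "event-2"},
                new String[] {"device-42", "event-3"},
                new String[] {"device-7", "event-4"});
        System.out.println(compact(log));
    }
}
```

In this model, the three records that shared a key collapse to the last one written, which is why duplicate keys only matter on topics where compaction is enabled.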

            People

              Assignee: Unassigned
              Reporter: Jasper Knulst (jasperknulst)