  1. Apache NiFi
  2. NIFI-6970

Add DistributeRecord processor to distribute data by key hash


Details

    • Type: New Feature
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      A processor is needed that distributes data across user-specified relationships according to one or more distribution keys. Data is distributed across relationships in proportion to each relationship's weight. For example, if there are two relationships and the first has a weight of 9 while the second has a weight of 10, the first receives 9/19 of the rows and the second receives 10/19.
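
      The proportional split can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `expected_shares` is hypothetical and is not part of NiFi or of the attached patch.

```python
from fractions import Fraction


def expected_shares(weights):
    """Return the fraction of rows each relationship should receive.

    `weights` is a list of non-negative integer weights, one per relationship.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    # Each relationship's share is its weight divided by the total weight.
    return [Fraction(w, total) for w in weights]


shares = expected_shares([9, 10])
print(shares)  # [Fraction(9, 19), Fraction(10, 19)]
```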

      Each row is sent to the relationship whose half-open interval contains the remainder of the key hash divided by the total weight. The interval for a relationship runs from 'prev_weights' to 'prev_weights + weight', where 'prev_weights' is the total weight of the relationships with lower numbers and 'weight' is the weight of this relationship. For example, if there are two relationships, and the first has a weight of 9 while the second has a weight of 10, the row is sent to the first relationship for remainders in the range [0, 9), and to the second for remainders in the range [9, 19).
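
      The interval lookup described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the processor's implementation: the issue does not name a hash function, so CRC32 (`zlib.crc32`) is used purely as a stand-in, and the names `pick_relationship` and `route` are hypothetical.

```python
import zlib
from bisect import bisect_right
from itertools import accumulate


def pick_relationship(remainder, weights):
    """Map a remainder in [0, sum(weights)) to a zero-based relationship index."""
    # Cumulative upper bounds of each half-open interval,
    # e.g. weights [9, 10] give bounds [9, 19] for [0, 9) and [9, 19).
    bounds = list(accumulate(weights))
    if not 0 <= remainder < bounds[-1]:
        raise ValueError("remainder must lie in [0, total weight)")
    # bisect_right returns the first interval whose upper bound exceeds the remainder.
    return bisect_right(bounds, remainder)


def route(key, weights):
    """Choose a relationship index for a distribution key (a string)."""
    # Assumption: CRC32 stands in for whatever hash the processor would use.
    key_hash = zlib.crc32(key.encode("utf-8"))
    return pick_relationship(key_hash % sum(weights), weights)


print(pick_relationship(8, [9, 10]))  # 0, because 8 is in [0, 9)
print(pick_relationship(9, [9, 10]))  # 1, because 9 is in [9, 19)
```

      Because the choice depends only on the key hash, rows with the same key always go to the same relationship, which is what a sharded target such as a distributed database needs.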

       

      This will help when loading data into distributed databases such as ClickHouse (https://clickhouse.tech/docs/en/).

      Attachments

        1. cluster_distribution.png
          74 kB
          Ilya Kovalev

            People

              Assignee: Unassigned
              Reporter: Ilya Kovalev (skeleton)
              Votes: 2
              Watchers: 2

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h