Apache Hop (incubating), HOP-2302

Stream Lookup: allow it to be run partitioned

    Details

      Description

      Currently it's not possible to run the Stream Lookup transform partitioned.

      In those cases where we can partition on the lookup key it would be beneficial to do so.

      One condition could be that the transform supplying the lookup data is also partitioned on the same key.

      For example, say we have a lookup set of id/value pairs [1/A,2/B,3/C,4/D,5/E,6/F,7/G,...]

      We could partition the Stream Lookup transform by id on 3 partitions.

      Given input lookupId values [1,2,3,4,5,6,...], this means that copy 0 would get ids [3,6,9,...], copy 1 [1,4,7,...] and copy 2 [2,5,8,...]
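
The distribution described above matches a simple remainder rule (id modulo the number of partitions). A minimal sketch of that rule follows; `partition_of` and `distribute` are hypothetical names used only for illustration, not Hop APIs, and the actual partitioning method in Hop may behave differently.

```python
from collections import defaultdict


def partition_of(key: int, num_partitions: int) -> int:
    """Assumed remainder rule: the copy number is key modulo the partition count."""
    return key % num_partitions


def distribute(keys, num_partitions):
    """Group keys by the copy number that would receive them."""
    copies = defaultdict(list)
    for key in keys:
        copies[partition_of(key, num_partitions)].append(key)
    return dict(copies)


# ids 1..9 spread over 3 partitions
assignment = distribute(range(1, 10), 3)
for copy_nr in sorted(assignment):
    print(copy_nr, assignment[copy_nr])
```

With ids 1 through 9 and 3 partitions, copy 0 ends up with 3, 6, 9, copy 1 with 1, 4, 7 and copy 2 with 2, 5, 8, as in the example.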

      So given that simple constraint we could specify that the source transform delivering the lookup data also needs to be partitioned on the same partitioning schema, method and field.
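
The constraint could be expressed as a plain equality check on the three partitioning properties. The sketch below is only an illustration: `Partitioning` and `can_run_partitioned` are hypothetical names, not existing Hop classes, and the schema and method values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partitioning:
    """Hypothetical description of how a transform is partitioned."""
    schema: str   # name of the partitioning schema
    method: str   # partitioning method, e.g. a remainder rule
    field: str    # field the rows are partitioned on


def can_run_partitioned(lookup_source: Partitioning, stream_lookup: Partitioning) -> bool:
    """Allow a partitioned Stream Lookup only if schema, method and field all match."""
    return lookup_source == stream_lookup


source = Partitioning(schema="three_partitions", method="remainder", field="id")
same = Partitioning(schema="three_partitions", method="remainder", field="id")
other_field = Partitioning(schema="three_partitions", method="remainder", field="name")

print(can_run_partitioned(source, same))         # matching settings
print(can_run_partitioned(source, other_field))  # different field
```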

      If that condition holds, copy 0 could simply read from source copy 0, copy 1 from source copy 1, and so on.
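
To illustrate why matching copies is enough, here is a small simulation under the same assumed remainder rule: each source copy holds only the lookup rows of its own partition, and each input id is resolved by the copy with the same number. All function names are hypothetical and this is a sketch of the idea, not how the Stream Lookup transform is implemented.

```python
def partition_of(key: int, num_partitions: int) -> int:
    """Assumed remainder rule: the copy number is key modulo the partition count."""
    return key % num_partitions


def build_source_copies(lookup_rows, num_partitions):
    """Split the id/value lookup rows into one small table per source copy."""
    copies = [dict() for _ in range(num_partitions)]
    for row_id, value in lookup_rows:
        copies[partition_of(row_id, num_partitions)][row_id] = value
    return copies


def partitioned_lookup(source_copies, input_ids):
    """Resolve each input id using only the source copy with the same copy number."""
    num_partitions = len(source_copies)
    results = {}
    for lookup_id in input_ids:
        copy_nr = partition_of(lookup_id, num_partitions)
        # None stands in for "no match found in the lookup stream"
        results[lookup_id] = source_copies[copy_nr].get(lookup_id)
    return results


lookup_rows = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F"), (7, "G")]
source_copies = build_source_copies(lookup_rows, 3)
results = partitioned_lookup(source_copies, [1, 2, 3, 4, 5, 6])
print(results)
```

Each copy only has to keep roughly a third of the lookup set in memory, yet every id is still found, because the input row and its lookup row always land on the same copy number.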

       

            People

            • Assignee:
              Unassigned
              Reporter:
              mcasters Matt Casters