Apache Hop (Retired) / HOP-2302

Stream Lookup: allow it to be run partitioned

Details

    Description

      Currently it's not possible to run the Stream Lookup transform partitioned.

      In those cases where we can partition on the lookup key it would be beneficial to do so.

      One condition could be that the transform from which the lookup data is read is also partitioned on the same key.

      For example, suppose we have a lookup set of id/value pairs [1/A, 2/B, 3/C, 4/D, 5/E, 6/F, 7/G, ...].

      We could partition the Stream Lookup transform by id on 3 partitions.

      Given input lookupId values [1, 2, 3, 4, 5, 6, ...], this means that copy 0 would handle ids [3, 6, 9, ...], copy 1 ids [1, 4, 7, ...] and copy 2 ids [2, 5, 8, ...], that is, each id goes to the copy numbered id modulo 3.
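The distribution in this example can be reproduced with a small sketch. It assumes a plain remainder-of-division rule, which matches the numbers above; the class and method names are hypothetical and are not part of the Hop API.

```java
import java.util.ArrayList;
import java.util.List;

class ModPartitionSketch {
    // Hypothetical helper: choose the copy number for an id by remainder of division.
    static int partitionFor(long id, int partitions) {
        return (int) Math.floorMod(id, (long) partitions);
    }

    // Distribute ids over the given number of copies, preserving input order.
    static List<List<Long>> distribute(long[] ids, int partitions) {
        List<List<Long>> copies = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            copies.add(new ArrayList<>());
        }
        for (long id : ids) {
            copies.get(partitionFor(id, partitions)).add(id);
        }
        return copies;
    }

    public static void main(String[] args) {
        long[] lookupIds = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        List<List<Long>> copies = distribute(lookupIds, 3);
        for (int copy = 0; copy < copies.size(); copy++) {
            // Prints: copy 0 -> [3, 6, 9], copy 1 -> [1, 4, 7], copy 2 -> [2, 5, 8]
            System.out.println("copy " + copy + " -> " + copies.get(copy));
        }
    }
}
```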

      Given that simple constraint, we could require that the source transform delivering the lookup data is also partitioned with the same partitioning schema, method and field.
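A minimal sketch of how such a condition might be checked is shown here. The `Partitioning` holder and the `canLookupPerCopy` method are hypothetical illustrations of the three-way comparison (schema, method, field) and do not correspond to existing Hop classes.

```java
import java.util.Objects;

class PartitioningCheckSketch {
    // Hypothetical value holder describing how a transform is partitioned.
    static final class Partitioning {
        final String schemaName;
        final String method;
        final String field;

        Partitioning(String schemaName, String method, String field) {
            this.schemaName = schemaName;
            this.method = method;
            this.field = field;
        }
    }

    // True only when both transforms use the same schema, method and field.
    static boolean canLookupPerCopy(Partitioning lookupSource, Partitioning streamLookup) {
        if (lookupSource == null || streamLookup == null) {
            return false; // an unpartitioned side cannot satisfy the condition
        }
        return Objects.equals(lookupSource.schemaName, streamLookup.schemaName)
            && Objects.equals(lookupSource.method, streamLookup.method)
            && Objects.equals(lookupSource.field, streamLookup.field);
    }

    public static void main(String[] args) {
        Partitioning source = new Partitioning("three-partitions", "mod", "id");
        Partitioning lookup = new Partitioning("three-partitions", "mod", "id");
        System.out.println(canLookupPerCopy(source, lookup));
    }
}
```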

      If that condition holds, Stream Lookup copy 0 could simply read from source copy 0, copy 1 from source copy 1, and so on.
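The copy-to-copy pairing can be illustrated with a rough sketch in which each copy holds only the lookup rows delivered by the matching source copy. It assumes the remainder-of-division rule from the example; all names are hypothetical and the in-memory maps stand in for whatever the transform would actually cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionedLookupSketch {
    // Same hypothetical rule on both sides: remainder of division by the copy count.
    static int partitionFor(long id, int partitions) {
        return (int) Math.floorMod(id, (long) partitions);
    }

    // Build one lookup table per copy, as if source copy n only fed lookup copy n.
    static List<Map<Long, String>> buildPerCopyTables(Map<Long, String> lookupRows, int partitions) {
        List<Map<Long, String>> tables = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            tables.add(new HashMap<>());
        }
        for (Map.Entry<Long, String> row : lookupRows.entrySet()) {
            tables.get(partitionFor(row.getKey(), partitions)).put(row.getKey(), row.getValue());
        }
        return tables;
    }

    // An input row is routed to the copy that owns its key, so a local lookup suffices.
    static String lookup(List<Map<Long, String>> tables, long lookupId) {
        int copy = partitionFor(lookupId, tables.size());
        return tables.get(copy).get(lookupId);
    }

    public static void main(String[] args) {
        Map<Long, String> lookupRows = new LinkedHashMap<>();
        String[] values = {"A", "B", "C", "D", "E", "F", "G"};
        for (int i = 0; i < values.length; i++) {
            lookupRows.put((long) (i + 1), values[i]);
        }
        List<Map<Long, String>> tables = buildPerCopyTables(lookupRows, 3);
        System.out.println("lookupId 4 -> " + lookup(tables, 4L));
    }
}
```

With this layout each copy keeps roughly a third of the lookup set in memory instead of the full set, which is where the benefit of partitioning would come from.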

       

          People

            Unassigned
            mcasters Matt Casters