BEAM-1197: Slowly-changing external data as a side input

Details

    • Type: Wish
    • Status: Resolved
    • Priority: P3
    • Resolution: Duplicate
    • None
    • Missing
    • Component: beam-model
    • None

    Description

      I've repeatedly seen the following pattern: a user wants to join a PCollection against a slowly-changing external dataset, e.g. a file on GCS or a Bigtable table.

      Side inputs come to mind, but the current side input mechanisms don't support periodically reloading a side input.

      The best hacky solution I came up with for one use case is documented here: http://stackoverflow.com/questions/41254028/can-dataflow-sideinput-be-updated-per-window-by-reading-a-gcs-bucket/41271159#41271159 . We need to do better than this.
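
      To make the pattern concrete, here is a minimal sketch of one possible workaround: skip side inputs and have each worker cache the external data, reloading it once the copy is older than a configured age. It is plain Python rather than Beam API, and it is not necessarily the approach from the linked answer. The names RefreshingLookup, join_element, load and max_age_seconds are hypothetical; in a pipeline, join_element would be the body of a per-element function and load would read the GCS file or Bigtable table.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RefreshingLookup:
    """Hypothetical helper: caches externally loaded data and reloads it
    once the cached copy is at least max_age_seconds old."""

    def __init__(
        self,
        load: Callable[[], Dict[str, str]],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load              # assumed to read the external dataset
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so tests can control time
        self._data: Optional[Dict[str, str]] = None
        self._loaded_at = 0.0

    def get(self) -> Dict[str, str]:
        now = self._clock()
        # Load on first use, then again whenever the copy has aged out.
        if self._data is None or now - self._loaded_at >= self._max_age:
            self._data = self._load()
            self._loaded_at = now
        return self._data


def join_element(
    element: Tuple[str, int], lookup: RefreshingLookup
) -> Tuple[str, int, Optional[str]]:
    """Joins one (key, value) element against the cached external data."""
    key, value = element
    # Keys missing from the external data yield None (a left join).
    return key, value, lookup.get().get(key)
```

      With this approach each worker reloads on its own schedule, so different workers can briefly see different versions of the data, and the reload is not tied to windows. That is part of why a first-class mechanism would be preferable.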

            People

              Assignee: Unassigned
              Reporter: Eugene Kirpichov (jkff)
              Votes: 1
              Watchers: 20
