Beam / BEAM-10676

Timers use the input timestamp as the timer output timestamp, which prevents watermark progress

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.24.0
    • Component/s: sdk-py-core, sdk-py-harness
    • Labels: None

      Description

      By default, the Python SDK sets a timer's output timestamp to the timestamp of the current element. This is problematic because:

      1. Every timer holds back the output watermark at the current element's timestamp.
      2. It does not match the Java SDK, which defaults to using the fire timestamp as the timer output timestamp (and adds a hold on it).
      3. Users cannot influence this behavior because there is no user-facing API for it.

      https://github.com/apache/beam/blob/dfadde2d3ee0a0487362dbcca80388fdc2ef2302/sdks/python/apache_beam/runners/worker/bundle_processor.py#L650

      We should use the fire timestamp as the default output timestamp.
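
To make the effect of the two defaults concrete, here is a minimal sketch in plain Python. It is not Beam's implementation: it assumes a simplified model in which the output watermark is the minimum of the input watermark and all timer holds, and the `Timer` class and the helper functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Timer:
    """Hypothetical timer record (illustration only, not a Beam class)."""
    element_timestamp: int                  # event time of the element that set the timer
    fire_timestamp: int                     # event time at which the timer should fire
    output_timestamp: Optional[int] = None  # explicit hold, if one were provided


def hold_element_default(timer: Timer) -> int:
    # Behavior described in this issue: default the hold to the element timestamp.
    if timer.output_timestamp is not None:
        return timer.output_timestamp
    return timer.element_timestamp


def hold_fire_default(timer: Timer) -> int:
    # Proposed behavior: default the hold to the fire timestamp.
    if timer.output_timestamp is not None:
        return timer.output_timestamp
    return timer.fire_timestamp


def output_watermark(input_watermark: int,
                     timers: List[Timer],
                     hold_fn: Callable[[Timer], int]) -> int:
    # Simplified model (assumption): the output watermark cannot pass any hold.
    return min([input_watermark] + [hold_fn(t) for t in timers])


timers = [Timer(element_timestamp=10, fire_timestamp=100),
          Timer(element_timestamp=20, fire_timestamp=120)]

print(output_watermark(90, timers, hold_element_default))  # 10: stuck at the oldest element
print(output_watermark(90, timers, hold_fire_default))     # 90: follows the input watermark
```

Under this model, pending timers pin the output watermark to the oldest element that set one, while the fire-timestamp default lets it advance until a timer is actually due.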

              People

              • Assignee: Maximilian Michels (mxm)
              • Reporter: Maximilian Michels (mxm)

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 1h 20m