Beam / BEAM-13230

Deduplicate transform fails on Dataflow Runner v2

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Component: runner-dataflow

    Description

      The Deduplicate transform fails when run on Dataflow Runner v2, raising the following error:

      generic::unknown: org.apache.beam.sdk.util.UserCodeException: java.lang.IllegalArgumentException: Attempted to set an event-time timer with an output timestamp of 294247-01-09T04:00:54.775Z that is after the timer firing timestamp 2021-11-12T18:55:12.516Z
      	at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39)
      	at org.apache.beam.sdk.transforms.Deduplicate$DeduplicateFn$DoFnInvoker.invokeProcessElement(Unknown Source)
      	at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForWindowObservingParDo(FnApiDoFnRunner.java:771)
      	at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:257)
      	at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:209)
      	at org.apache.beam.fn.harness.BeamFnDataReadRunner.forwardElementToConsumer(BeamFnDataReadRunner.java:172)
      	at org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver2.awaitCompletion(BeamFnDataInboundObserver2.java:126)
      	at org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:467)
      	at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:151)
      	at org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:116)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalArgumentException: Attempted to set an event-time timer with an output timestamp of 294247-01-09T04:00:54.775Z that is after the timer firing timestamp 2021-11-12T18:55:12.516Z
      	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
      	at org.apache.beam.fn.harness.FnApiDoFnRunner$FnApiTimer.getTimerForTime(FnApiDoFnRunner.java:1914)
      	at org.apache.beam.fn.harness.FnApiDoFnRunner$FnApiTimer.setRelative(FnApiDoFnRunner.java:1839)
      	at org.apache.beam.sdk.transforms.Deduplicate$DeduplicateFn.processElement(Deduplicate.java:318)
      

      Relevant recent change to Deduplicate: https://github.com/apache/beam/commit/ce3a5545e1ac5a655a2c01374b89c08bf5b3e34a#diff-6a2e50eb57656ea50a5faa1a0346af656bee517103c3320e0ad08d6cdb2778b5
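
      To make the failure condition concrete, here is a minimal standalone sketch of the rule the error message describes: an event-time timer's output timestamp must not be later than its firing timestamp. This is not Beam's actual code. The class TimerTimestampCheck and the method checkOutputTimestamp are hypothetical names, and the far-future value is an assumption, chosen because the output timestamp in the error appears to match the end of Beam's global window (the maximum timestamp minus one day).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeUnit;

public class TimerTimestampCheck {

    // Hypothetical stand-in for the check that fails in the stack trace:
    // reject an output timestamp that is after the timer firing timestamp.
    public static void checkOutputTimestamp(Instant outputTimestamp, Instant fireTimestamp) {
        if (outputTimestamp.isAfter(fireTimestamp)) {
            throw new IllegalArgumentException(
                "Attempted to set an event-time timer with an output timestamp of "
                    + outputTimestamp
                    + " that is after the timer firing timestamp "
                    + fireTimestamp);
        }
    }

    public static void main(String[] args) {
        // Firing timestamp taken from the error message in this issue.
        Instant fire = Instant.parse("2021-11-12T18:55:12.516Z");

        // Assumption: Long.MAX_VALUE microseconds, truncated to milliseconds,
        // minus one day. This is meant to approximate the far-future output
        // timestamp reported in the error.
        Instant farFuture =
            Instant.ofEpochMilli(TimeUnit.MICROSECONDS.toMillis(Long.MAX_VALUE))
                .minus(Duration.ofDays(1));

        // An output timestamp equal to the firing timestamp is accepted.
        checkOutputTimestamp(fire, fire);

        // A far-future output timestamp is rejected, as in the report.
        try {
            checkOutputTimestamp(farFuture, fire);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

      If this reading is right, the timer set at Deduplicate.java:318 carries an output timestamp far in the future while its firing time is near the present, and Runner v2 rejects that combination.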

            People

              Assignee: Unassigned
              Reporter: Brian Hulette (bhulette)