Details
Type: Bug
Status: Resolved
Priority: P1
Resolution: Fixed
Description
I have identified a seemingly nondeterministic issue in Dataflow translation: pipelines with side inputs are sometimes translated in the wrong order. The failure looks like this:
java.lang.NullPointerException: Unknown producer for value SimplePCollectionView{tag=Tag<org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.<init>:1221#4dca087078898728>} while translating step TfIdf.ComputeTfIdf/Combine.globally(Count)/ProduceDefault
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:1227)
Seen on https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2_PR/32/testReport/junit/org.apache.beam.examples.complete/TfIdfIT/testE2ETfIdf/ and also on other changes. I think the change itself is only triggering the nondeterministic problem, not causing it.
So there is a lurking problem with side inputs overall.
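To make the suspected failure mode concrete, here is a toy sketch of an order-dependent translation. It is not the Dataflow runner's code: ToyTranslator, Step, and translateAll are hypothetical names invented for illustration, and the step and value names are simplified. The only assumption is the one the error message suggests, namely that a step's inputs must already have a registered producer when that step is translated.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model of step translation (hypothetical; not the Dataflow translator). */
class ToyTranslator {

    /** A step reads zero or more values and produces exactly one value. */
    static final class Step {
        final String name;
        final List<String> inputs;
        final String output;

        Step(String name, List<String> inputs, String output) {
            this.name = name;
            this.inputs = inputs;
            this.output = output;
        }
    }

    // Maps each produced value to the name of the step that produced it.
    private final Map<String, String> producers = new HashMap<>();

    /** Translates one step; every input must already have a known producer. */
    void translate(Step step) {
        for (String input : step.inputs) {
            // Same shape as the failing precondition in the report: a null
            // lookup means the consumer was visited before its producer.
            Objects.requireNonNull(
                producers.get(input),
                "Unknown producer for value " + input
                    + " while translating step " + step.name);
        }
        producers.put(step.output, step.name);
    }

    /** Translates steps in the given order; returns null on success, else the error message. */
    static String translateAll(List<Step> order) {
        ToyTranslator translator = new ToyTranslator();
        try {
            for (Step step : order) {
                translator.translate(step);
            }
            return null;
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }
}
```

In this sketch, translating a view-producing step before the step that consumes the view succeeds, while the reverse order fails with an "Unknown producer" message. If the real traversal order depends on something unstable, the same pipeline could pass or fail from run to run, which would match the flakiness described above.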