Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-12215

DirectRunner NullPointerException

Details

    • Bug
    • Status: Triage Needed
    • P3
    • Resolution: Unresolved
    • 2.28.0
    • None
    • None

    Description

      When using the DirectRunner in tests I encountered flaky tests due to NullPointerException.

      When the test fails the relevant stack trace is like below:

       

       

      java.lang.NullPointerException
       at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877)
       at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.ThreadFactoryBuilder.setThreadFactory(ThreadFactoryBuilder.java:132)
       at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:192)
       at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
       at org.apache.beam.sdk.Pipeline.run(Pipeline.java:322)
       at google.registry.beam.TestPipelineExtension.run(TestPipelineExtension.java:344)
       at google.registry.beam.TestPipelineExtension.run(TestPipelineExtension.java:324)
       at google.registry.beam.spec11.Spec11PipelineTest.testSuccess_saveToGcs(Spec11PipelineTest.java:185)

       

       
      However the null pointer seems to have come from calling (a repackaged and very innocent looking) guava function to set the ThreadFactory:
       

       ExecutorService metricsPool =
       Executors.newCachedThreadPool(
       new ThreadFactoryBuilder()
      .setThreadFactory(MoreExecutors.platformThreadFactory())
       .setDaemon(false) // otherwise you say you want to leak, please don't!
       .setNameFormat("direct-metrics-counter-committer")
       .build());

       
      Which in turn is just calling java.util.concurrent.Executors.defaulThreatFactory() when not on App Engine:
       

      @Beta
      @GwtIncompatible
      public static ThreadFactory platformThreadFactory() {
       if (!isAppEngine()) {
        return Executors.defaultThreadFactory();
        } else {
       ...
      } 

       

      I don't understand how this could possibly have failed. The test seems to be only flaky on our CI servers running Ubuntu and I haven't been able to reproduce it locally, making debugging a challenge.

       

      I'd appreciate any input you may have. Thanks!

      Attachments

        Activity

          People

            Unassigned Unassigned
            jianglai Lai Jiang
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: