BEAM-6249

Vendored gRPC doesn't seem to work with dataflow

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.9.0
    • Fix Version/s: None
    • Component/s: runner-dataflow

      Description

      I attempted to migrate an existing pipeline (that worked in 2.8.0) to 2.9.0. This pipeline uses the experimental streaming engine (--experiments=enable_streaming_engine).

      The pipeline fails to start with these logs:

      D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64', trying other loading mechanism. 
      D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64 cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
      D  Unable to load the library '/tmp/liborg_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_646918605450681921540.so', trying other loading mechanism. 
      D  Unable to load the library 'netty_tcnative_linux_x86_64', trying next name... 
      D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora', trying other loading mechanism. 
      D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
      D  Unable to load the library 'netty_tcnative_linux_x86_64_fedora', trying next name... 
      D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64', trying other loading mechanism. 
      D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64 cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
      D  Unable to load the library 'netty_tcnative_x86_64', trying next name... 
      D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative', trying other loading mechanism. 
      D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
      D  Unable to load the library 'netty_tcnative', trying next name... 
      D  Failed to load netty-tcnative; OpenSslEngine will be unavailable, unless the application has already loaded the symbols by some other means. See http://netty.io/wiki/forked-tomcat-native.html for more information. 
      D  Failed to initialize netty-tcnative; OpenSslEngine will be unavailable. See http://netty.io/wiki/forked-tomcat-native.html for more information. 
      I  netty-tcnative unavailable (this may be normal) 
      I  Conscrypt not found (this may be normal) 
      I  Jetty ALPN unavailable (this may be normal) 
      E  Uncaught exception in main thread. Exiting with status code 1. 
      W  Please use a logger instead of System.out or System.err.
      Please switch to using org.slf4j.Logger.
      See: https://cloud.google.com/dataflow/pipelines/logging 
      E  Uncaught exception in main thread. Exiting with status code 1. 
      E  java.lang.IllegalStateException: Could not find TLS ALPN provider; no working netty-tcnative, Conscrypt, or Jetty NPN/ALPN available 
      E  	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.defaultSslProvider(GrpcSslContexts.java:256) 
      E  	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:171) 
      E  	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:120) 
      E  	at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.remoteChannel(GrpcWindmillServer.java:343) 
      E  	at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.initializeWindmillService(GrpcWindmillServer.java:312) 
      

       

      The interesting part is the netty load failure. Its stack trace is:

      exception: "java.lang.UnsatisfiedLinkError
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:276)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:187)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadFirstAvailable(NativeLibraryLoader.java:85)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.handler.ssl.OpenSsl.loadTcNative(OpenSsl.java:430)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.handler.ssl.OpenSsl.<clinit>(OpenSsl.java:97)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.defaultSslProvider(GrpcSslContexts.java:242)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:171)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:120)
      	at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.remoteChannel(GrpcWindmillServer.java:343)
      	at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.initializeWindmillService(GrpcWindmillServer.java:312)
      	at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.setWindmillServiceEndpoints(GrpcWindmillServer.java:192)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getConfigFromDataflowService(StreamingDataflowWorker.java:1528)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getConfig(StreamingDataflowWorker.java:1583)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getGlobalConfig(StreamingDataflowWorker.java:1568)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.schedulePeriodicGlobalConfigRequests(StreamingDataflowWorker.java:1543)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.start(StreamingDataflowWorker.java:704)
      	at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.main(StreamingDataflowWorker.java:228)
      Caused by: java.lang.reflect.InvocationTargetException
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
      	... 17 more
      Caused by: java.lang.NoClassDefFoundError: org/apache/beam/vendor/grpc/v1/13/1/io/netty/internal/tcnative/Library
      	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
      	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
      	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
      	at java.lang.Runtime.load0(Runtime.java:809)
      	at java.lang.System.load(System.java:1086)
      	at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
      	... 24 more
      Caused by: java.lang.ClassNotFoundException: org.apache.beam.vendor.grpc.v1.13.1.io.netty.internal.tcnative.Library
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	... 30 more

       

      Notice that the class the JVM is asked to load is:

      org.apache.beam.vendor.grpc.v1.13.1.io.netty.internal.tcnative.Library, but the jar actually defines it as org.apache.beam.vendor.grpc.v1_13_1.io.netty.internal.tcnative.Library.

      I traced this back to the jni interop code in tcnative:

      https://github.com/netty/netty-tcnative/blob/master/openssl-dynamic/src/main/c/jnilib.c#L266

      Here it replaces every _ in the package prefix with /, so the v1_13_1 segment of the vendored package becomes v1/13/1 and the class lookup fails. A likely fix is to repackage the vendored gRPC under a prefix that doesn't contain underscores.
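
      To illustrate the mangling described above, here is a rough Java model of it. This is only a sketch, not the actual C code in jnilib.c: the class name PrefixMangling, the method derivePackagePrefix, and the assumption that the prefix is everything before "netty_tcnative" in the library name are mine, chosen to reproduce the class name seen in the NoClassDefFoundError.

```java
class PrefixMangling {
    // Assumption: the shaded prefix is whatever precedes this base name
    // in the native library name reported in the logs.
    private static final String BASE_NAME = "netty_tcnative";

    /** Derives a class-path prefix by turning every '_' in the prefix into '/'. */
    static String derivePackagePrefix(String libraryName) {
        int idx = libraryName.indexOf(BASE_NAME);
        if (idx <= 0) {
            return ""; // no shaded prefix present
        }
        String prefix = libraryName.substring(0, idx);
        // The lossy step: underscores that were part of a package name
        // (v1_13_1) are indistinguishable from package separators.
        return prefix.replace('_', '/');
    }

    public static void main(String[] args) {
        String lib = "org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64";
        String derived = derivePackagePrefix(lib) + "io/netty/internal/tcnative/Library";
        String inJar = "org/apache/beam/vendor/grpc/v1_13_1/io/netty/internal/tcnative/Library";
        System.out.println("derived: " + derived);
        System.out.println("in jar:  " + inJar);
        System.out.println("match:   " + derived.equals(inJar));
    }
}
```

      Under these assumptions the derived name contains v1/13/1 where the jar has v1_13_1, which is consistent with the ClassNotFoundException above.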

      I'm curious how this ever worked, though. Maybe the streaming engine is the only thing using this vendored gRPC code?

              People

              • Assignee:
                takidau Tyler Akidau
                Reporter:
                SteveNiemitz Steve Niemitz
