Spark / SPARK-27610

Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.2
    • Fix Version/s: 3.0.0
    • Component/s: Shuffle
    • Labels:
      None

      Description

      Enabling Netty epoll mode in the YARN external shuffle service (spark.shuffle.io.mode=EPOLL) causes the YARN NodeManager to abort on startup.
      Judging by the stack trace, the io.netty package is shaded but the native libraries provided by netty-all are not, so the shaded loader looks for a library name that is not present in the jar:
       

      Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so
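
The mismatch can be illustrated with a minimal sketch. It assumes, based only on the resource name in the error above, that the shaded loader prepends the relocation prefix (dots replaced by underscores) to the base library name. The class `ShadedNativeLibName` and its methods are hypothetical helpers written for this illustration, not part of Netty or Spark.

```java
public class ShadedNativeLibName {

    // Hypothetical helper: models the resource name seen in the error message.
    // shadedPrefix is the relocation prefix (e.g. "org.spark_project."),
    // baseName is the unshaded native library name.
    static String expectedResource(String shadedPrefix, String baseName) {
        String mangled = shadedPrefix.replace('.', '_');
        return "META-INF/native/lib" + mangled + baseName + ".so";
    }

    // Returns true if the given resource is visible on the current classpath.
    static boolean onClasspath(String resource) {
        return ClassLoader.getSystemResource(resource) != null;
    }

    public static void main(String[] args) {
        String shaded = expectedResource(
            "org.spark_project.", "netty_transport_native_epoll_x86_64");
        String unshaded = expectedResource(
            "", "netty_transport_native_epoll_x86_64");

        // The shaded loader asks for the first name; an unmodified
        // netty-all jar would only carry the second.
        System.out.println("shaded:   " + shaded + " present=" + onClasspath(shaded));
        System.out.println("unshaded: " + unshaded + " present=" + onClasspath(unshaded));
    }
}
```

Running this with the shuffle service jar on the classpath would show which of the two names is actually packaged. If only the unshaded name is present, the native library would need to be renamed to the prefixed form when the jar is built.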

      Full stack trace:

      2019-04-24 23:14:46,372 ERROR [main] nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(639)) - Error starting NodeManager
      java.lang.UnsatisfiedLinkError: failed to load the required native library
          at org.spark_project.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoop.<clinit>(EpollEventLoop.java:55)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
          at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
          at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
          at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
          at org.spark_project.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
          at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
          at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:52)
          at org.apache.spark.network.server.TransportServer.init(TransportServer.java:95)
          at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:75)
          at org.apache.spark.network.TransportContext.createServer(TransportContext.java:108)
          at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:186)
          at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
          at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:147)
          at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
          at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
          at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:268)
          at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
          at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
          at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
          at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
          at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
          at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
      Caused by: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll_x86_64
          at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
          at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
          at org.spark_project.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
          at org.spark_project.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
          ... 26 more
          Suppressed: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll
              at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
              at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:210)
              ... 28 more
          Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll.so
              at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
              ... 29 more
              Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll in java.library.path
                  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
                  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
                  at java.lang.System.loadLibrary(System.java:1122)
                  at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
                  at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
                  at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
                  ... 29 more
                  Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll in java.library.path
                      at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
                      at java.lang.Runtime.loadLibrary0(Runtime.java:870)
                      at java.lang.System.loadLibrary(System.java:1122)
                      at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
                      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                      at java.lang.reflect.Method.invoke(Method.java:498)
                      at org.spark_project.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
                      at java.security.AccessController.doPrivileged(Native Method)
                      at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
                      at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
                      ... 30 more
      Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so
          at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
          ... 29 more
          Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll_x86_64 in java.library.path
              at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
              at java.lang.Runtime.loadLibrary0(Runtime.java:870)
              at java.lang.System.loadLibrary(System.java:1122)
              at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
              at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
              at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
              ... 29 more
              Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll_x86_64 in java.library.path
                  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
                  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
                  at java.lang.System.loadLibrary(System.java:1122)
                  at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:498)
                  at org.spark_project.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
                  at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
      
      

            People

            • Assignee:
              Adrian Muraru (amuraru)
            • Reporter:
              Adrian Muraru (amuraru)