Spark / SPARK-21494

Spark 2.2.0 AES encryption not working with External shuffle


Details

    Description

      Spark’s new AES-based authentication mechanism does not seem to work when configured with the external shuffle service on YARN.

      Here is the stack trace for the error we see in the driver logs:
      ERROR YarnScheduler: Lost executor 40 on ip-10-167-104-125.ec2.internal: Unable to create executor due to Unable to register with external shuffle server due to: java.lang.IllegalArgumentException: Authentication failed.
      at org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:125)
      at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
      at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:105)
      at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
      at org.spark_project.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
      at org.spark_project.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
      at org.spark_project.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)

      Here are the settings we are configuring in ‘spark-defaults’ and ‘yarn-site’:
      spark.network.crypto.enabled true
      spark.network.crypto.saslFallback false
      spark.authenticate true
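
A quick way to rule out a plain configuration mismatch between the two sides is to compare the three authentication keys directly. The following is only a minimal sketch: `parse_properties` and `diff_auth_settings` are hypothetical helper names, not Spark APIs. It assumes the `key value` whitespace-separated layout shown above, and it assumes the values from ‘yarn-site’ have already been extracted into the same form.

```python
# Hypothetical helpers for illustration only; none of this is part of Spark.

AUTH_KEYS = (
    "spark.authenticate",
    "spark.network.crypto.enabled",
    "spark.network.crypto.saslFallback",
)


def parse_properties(text):
    """Parse 'key value' lines (whitespace-separated) into a dict.

    Assumption: only the whitespace-separated form is handled here.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def diff_auth_settings(spark_side, shuffle_side):
    """Return {key: (spark_value, shuffle_value)} for auth keys that differ."""
    mismatches = {}
    for key in AUTH_KEYS:
        a, b = spark_side.get(key), shuffle_side.get(key)
        if a != b:
            mismatches[key] = (a, b)
    return mismatches


# The settings quoted in this report.
REPORTED = """
spark.network.crypto.enabled true
spark.network.crypto.saslFallback false
spark.authenticate true
"""

spark_conf = parse_properties(REPORTED)
# Assumption: the shuffle service side carries the same values, as reported.
shuffle_conf = dict(spark_conf)

print(diff_auth_settings(spark_conf, shuffle_conf))
```

With identical settings on both sides, as described in this report, the helper finds no differences, so a simple mismatch of these three keys would not explain the failure.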

      Turning on DEBUG logging for the package ‘org.apache.spark.network.crypto’ on both the Spark and YARN sides does not reveal much about why authentication fails. The driver and node manager logs have been attached to the JIRA.


          People

            vanzin Marcelo Masiero Vanzin
            uditme Udit Mehrotra