SPARK-27491

SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns an empty response, therefore Airflow won't integrate with Spark 2.3.x


      Description

      This issue must have been introduced after Spark 2.1.1, as the status call works in that version. It affects me on Spark 2.3.0 and 2.3.3. I am using Spark standalone mode, if that makes a difference.

      As the traces below show, Spark 2.3.3 returns an empty response while 2.1.1 prints the expected status response.
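      For a sanity check that is independent of spark-submit, the same status query can be issued directly against the standalone master's REST endpoint, which is what RestSubmissionClient itself calls under the hood. A minimal sketch, assuming curl is available and reusing the host, port, and submission ID from the traces below:

      # Query the standalone master's REST submission API directly; a healthy
      # server answers with the same SubmissionStatusResponse JSON shown below.
      curl -s http://domainhere:6066/v1/submissions/status/driver-20190417130324-0009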


      Spark 2.1.1:

      [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + export SPARK_HOME=/home/ec2here/spark_home1
      + SPARK_HOME=/home/ec2here/spark_home1
      + '[' -z /home/ec2here/spark_home1 ']'
      + . /home/ec2here/spark_home1/bin/load-spark-env.sh
      ++ '[' -z /home/ec2here/spark_home1 ']'
      ++ '[' -z '' ']'
      ++ export SPARK_ENV_LOADED=1
      ++ SPARK_ENV_LOADED=1
      ++ parent_dir=/home/ec2here/spark_home1
      ++ user_conf_dir=/home/ec2here/spark_home1/conf
      ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
      ++ set -a
      ++ . /home/ec2here/spark_home1/conf/spark-env.sh
      +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      ++++ ulimit -n 1048576
      ++ set +a
      ++ '[' -z '' ']'
      ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
      ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
      ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
      ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
      ++ export SPARK_SCALA_VERSION=2.10
      ++ SPARK_SCALA_VERSION=2.10
      + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
      + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
      + '[' -d /home/ec2here/spark_home1/jars ']'
      + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
      + '[' '!' -d /home/ec2here/spark_home1/jars ']'
      + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
      + '[' -n '' ']'
      + [[ -n '' ]]
      + CMD=()
      + IFS=
      + read -d '' -r ARG
      ++ build_command org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      ++ printf '%d\0' 0
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + COUNT=10
      + LAST=9
      + LAUNCHER_EXIT_CODE=0
      + [[ 0 =~ ^[0-9]+$ ]]
      + '[' 0 '!=' 0 ']'
      + CMD=("${CMD[@]:0:$LAST}")
      + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20190417130324-0009 in spark://domainhere:6066.
      19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with SubmissionStatusResponse:

      { "action" : "SubmissionStatusResponse", "driverState" : "FAILED", "serverSparkVersion" : "2.3.3", "submissionId" : "driver-20190417130324-0009", "success" : true, "workerHostPort" : "x.y.211.40:11819", "workerId" : "worker-20190417115840-x.y.211.40-11819" }

      [ec2here@ip-x-y-160-225 ~]$
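      Because the 2.1.1 client prints the SubmissionStatusResponse, a caller can recover the driver state with ordinary text tools. A minimal sketch of the kind of parsing Airflow-style tooling relies on (the exact pattern Airflow uses may differ; note the log output goes to stderr, hence the 2>&1):

      # Pull driverState out of the printed response; prints FAILED for the run above.
      /home/ec2here/spark_home1/bin/spark-class org.apache.spark.deploy.SparkSubmit \
        --master spark://domainhere:6066 \
        --status driver-20190417130324-0009 2>&1 \
        | sed -n 's/.*"driverState" *: *"\([A-Z]*\)".*/\1/p'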


      Spark 2.3.3:

      [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + '[' -z '' ']'
      ++ dirname /home/ec2here/spark_home/bin/spark-class
      + source /home/ec2here/spark_home/bin/find-spark-home
      ++++ dirname /home/ec2here/spark_home/bin/spark-class
      +++ cd /home/ec2here/spark_home/bin
      +++ pwd
      ++ FIND_SPARK_HOME_PYTHON_SCRIPT=/home/ec2here/spark_home/bin/find_spark_home.py
      ++ '[' '!' -z '' ']'
      ++ '[' '!' -f /home/ec2here/spark_home/bin/find_spark_home.py ']'
      ++++ dirname /home/ec2here/spark_home/bin/spark-class
      +++ cd /home/ec2here/spark_home/bin/..
      +++ pwd
      ++ export SPARK_HOME=/home/ec2here/spark_home
      ++ SPARK_HOME=/home/ec2here/spark_home
      + . /home/ec2here/spark_home/bin/load-spark-env.sh
      ++ '[' -z /home/ec2here/spark_home ']'
      ++ '[' -z '' ']'
      ++ export SPARK_ENV_LOADED=1
      ++ SPARK_ENV_LOADED=1
      ++ export SPARK_CONF_DIR=/home/ec2here/spark_home/conf
      ++ SPARK_CONF_DIR=/home/ec2here/spark_home/conf
      ++ '[' -f /home/ec2here/spark_home/conf/spark-env.sh ']'
      ++ set -a
      ++ . /home/ec2here/spark_home/conf/spark-env.sh
      +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
      ++++ ulimit -n 1048576
      ++ set +a
      ++ '[' -z '' ']'
      ++ ASSEMBLY_DIR2=/home/ec2here/spark_home/assembly/target/scala-2.11
      ++ ASSEMBLY_DIR1=/home/ec2here/spark_home/assembly/target/scala-2.12
      ++ [[ -d /home/ec2here/spark_home/assembly/target/scala-2.11 ]]
      ++ '[' -d /home/ec2here/spark_home/assembly/target/scala-2.11 ']'
      ++ export SPARK_SCALA_VERSION=2.12
      ++ SPARK_SCALA_VERSION=2.12
      + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
      + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
      + '[' -d /home/ec2here/spark_home/jars ']'
      + SPARK_JARS_DIR=/home/ec2here/spark_home/jars
      + '[' '!' -d /home/ec2here/spark_home/jars ']'
      + LAUNCH_CLASSPATH='/home/ec2here/spark_home/jars/*'
      + '[' -n '' ']'
      + [[ -n '' ]]
      + set +o posix
      + CMD=()
      + IFS=
      + read -d '' -r ARG
      ++ build_command org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp '/home/ec2here/spark_home/jars/*' org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      ++ printf '%d\0' 0
      + CMD+=("$ARG")
      + IFS=
      + read -d '' -r ARG
      + COUNT=10
      + LAST=9
      + LAUNCHER_EXIT_CODE=0
      + [[ 0 =~ ^[0-9]+$ ]]
      + '[' 0 '!=' 0 ']'
      + CMD=("${CMD[@]:0:$LAST}")
      + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp '/home/ec2here/spark_home/conf/:/home/ec2here/spark_home/jars/*' -Xmx2048m org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009
      [ec2here@ip-x-y-160-225 ~]$ ps -ef | grep -i spark


      Note that in the 2.3.3 trace the exec'd JVM prints nothing at all before the prompt returns: no RestSubmissionClient log lines and no SubmissionStatusResponse. This means Apache Airflow does not work with Spark 2.3.x: the SparkSubmitOperator stays in the running state forever because it never gets a response from Spark's REST status calls.
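      Until this is fixed, one possible workaround is to poll the master's REST endpoint directly instead of scraping spark-submit's stdout, since the server side still responds correctly (as the 2.1.1 trace shows). A hedged sketch, reusing the host, port, and submission ID from the traces above; the set of terminal driver states (FINISHED, FAILED, KILLED, ERROR) is an assumption based on the standalone master's driver lifecycle:

      # Poll the REST status endpoint until the driver reaches a terminal state.
      SUBMISSION_ID=driver-20190417130324-0009
      while true; do
        STATE=$(curl -s "http://domainhere:6066/v1/submissions/status/$SUBMISSION_ID" \
          | sed -n 's/.*"driverState" *: *"\([A-Z]*\)".*/\1/p')
        echo "driverState=$STATE"
        case "$STATE" in
          FINISHED|FAILED|KILLED|ERROR) break ;;  # assumed terminal states
        esac
        sleep 10
      done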

            People

            • Assignee: Unassigned
            • Reporter: t oo (toopt4)
            • Votes: 0
            • Watchers: 3
