Bigtop / BIGTOP-2074

spark-worker doesn't start during deploy from master

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.0
    • Component/s: deployment
    • Labels:
      None
    • Environment:

      Official Bigtop ubuntu-14.04 docker image with Hiera 1.3.0

      Description

      spark-worker refuses to start automatically after puppet apply. The error message is as follows:

      15/09/26 07:05:54 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@ignite.docker:7078]
      15/09/26 07:05:54 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkWorker@ignite.docker:7078]
      15/09/26 07:05:54 INFO util.Utils: Successfully started service 'sparkWorker' on port 7078.
      Exception in thread "main" org.apache.spark.SparkException: Invalid master URL: spark://:7077
              at org.apache.spark.util.Utils$.extractHostPortFromSparkUrl(Utils.scala:1986)
              at org.apache.spark.deploy.master.Master$.toAkkaUrl(Master.scala:879)
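
The URL in the exception, spark://:7077, has an empty host part, which suggests the worker was started without a master host. The following Ruby sketch is not Spark's actual code; `extract_host_port`, `master_url`, and the host name `master.example.com` are hypothetical and only illustrate how an empty host leads to this kind of rejection.

```ruby
# Hypothetical helper, NOT Spark's implementation: splits a spark:// URL into
# host and port and rejects the URL when the host part is empty.
def extract_host_port(spark_url)
  match = %r{\Aspark://([^:/]*):(\d+)\z}.match(spark_url)
  if match.nil? || match[1].empty?
    raise ArgumentError, "Invalid master URL: #{spark_url}"
  end
  [match[1], match[2].to_i]
end

# Assumed way a start script might compose the URL from two settings.
def master_url(host, port)
  "spark://#{host}:#{port}"
end

# A populated host parses fine (host name is made up for illustration).
p extract_host_port(master_url("master.example.com", 7077))

# An empty host reproduces the shape of the URL seen in the log.
begin
  extract_host_port(master_url("", 7077))
rescue ArgumentError => e
  puts e.message
end
```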
      

          Activity

          evans_ye Evans Ye added a comment -

          Waited for a day. I've committed this directly.

          cos Konstantin Boudnik added a comment -

          Looks great! Thanks for taking care of this, Evans Ye!

          evans_ye Evans Ye added a comment - - edited

          The bug is fairly complicated. Konstantin Boudnik spotted most of the problems:

          • the STANDALONE_SPARK_MASTER_HOST setting is absent from spark-env.sh
          • export SPARK_MASTER_IP=<%= @master_port %> uses the port where the host is expected

          Another bug I found is that all the code in the spark init.pp is looking for $common::master_host, but the key in cluster.yaml is $common::spark_master_host instead.

          The following is fine since the master_host is set to $fqdn only if $spark::common::master_host is missing, hence $fqdn should be a good enough default value.

          class common ($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") 
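
The effect of the key mismatch combined with the $fqdn default can be sketched in Ruby. This is only an illustration under stated assumptions: `lookup` is a hypothetical stand-in for a data lookup with a default, and the key names and host names in the hash are made up to mirror the mismatch described above, not copied from cluster.yaml.

```ruby
# Hypothetical stand-in for a data lookup with a default value;
# this is NOT the Hiera API.
def lookup(data, key, default)
  data.fetch(key, default)
end

# Assumed data: the site configuration defines spark_master_host ...
site_data = { "spark::common::spark_master_host" => "master.example.com" }
# ... and the manifest runs on a worker whose fqdn is (made up):
worker_fqdn = "worker1.example.com"

# The manifest asks for master_host, which the data does not define,
# so the default (the local fqdn) is returned without any error.
mismatched = lookup(site_data, "spark::common::master_host", worker_fqdn)

# With matching key names the configured master is returned.
matched = lookup(site_data, "spark::common::spark_master_host", worker_fqdn)

puts "mismatched key -> #{mismatched}"
puts "matching key   -> #{matched}"
```

The point of the sketch is that a default hides the mismatch: nothing fails at lookup time, and the wrong host only shows up later when the worker tries to reach the master.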
          

          I've uploaded a patch to fix the spark worker deployment. It works well on a two-node Docker cluster.

          cos Konstantin Boudnik added a comment -

          Also, it seems that init.pp has a bug in this line:

            class common ($master_host = $fqdn, $master_port = "7077", $master_ui_port = "18080") {
          

          which will set master_host to a worker's hostname if the recipe is run on a non-master node. I believe the correct code should look like

            class common ($master_host = $common::master_host, $master_port = "7077", $master_ui_port = "18080") {
          

          YoungWoo Kim, could you please take a look at it? Thanks!

          cos Konstantin Boudnik added a comment - - edited

          Ah, indeed... here's the bug

          export SPARK_MASTER_IP=<%= @master_port %>
          

          should be using @master_host instead
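
To see what the template line produces, here is a minimal Ruby sketch that renders both variants with the standard `erb` library. The class `SparkEnvScope` and the values `master.example.com` and `7077` are assumptions for illustration; in a real Puppet run the instance variables come from the manifest scope, not from a class like this.

```ruby
require "erb"

# Hypothetical stand-in for the template scope: ERB resolves @master_host and
# @master_port as instance variables of the object whose binding it receives.
class SparkEnvScope
  def initialize(master_host, master_port)
    @master_host = master_host
    @master_port = master_port
  end

  # Renders an ERB template string against this object's instance variables.
  def render(template)
    ERB.new(template).result(binding)
  end
end

buggy_line = "export SPARK_MASTER_IP=<%= @master_port %>\n"
fixed_line = "export SPARK_MASTER_IP=<%= @master_host %>\n"

scope = SparkEnvScope.new("master.example.com", "7077")
puts scope.render(buggy_line)  # the port ends up in SPARK_MASTER_IP
puts scope.render(fixed_line)  # the host ends up in SPARK_MASTER_IP
```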

          cos Konstantin Boudnik added a comment - - edited

          The problem is caused by the absence of the STANDALONE_SPARK_MASTER_HOST setting in spark-env.sh. The variable has to be present for standalone deployment. Hint: not everyone cares to use YARN, really.

          Looking more into the generated env file, I see that SPARK_MASTER_IP is set to the port number instead of the hostname. Looks like there's a deeper issue with the puppet template or something?
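
Both symptoms can be checked mechanically against a generated env file. The Ruby sketch below is a hypothetical sanity check, not part of Bigtop; `spark_env_problems` and the sample file contents are assumptions, and it only understands plain `export NAME=value` lines.

```ruby
# Hypothetical check for a rendered spark-env.sh: reports a missing
# STANDALONE_SPARK_MASTER_HOST and a SPARK_MASTER_IP that is purely numeric.
def spark_env_problems(text)
  # Collect simple "export NAME=value" lines into a hash.
  vars = text.scan(/^export\s+(\w+)=(.*)$/).to_h
  problems = []
  unless vars.key?("STANDALONE_SPARK_MASTER_HOST")
    problems << "STANDALONE_SPARK_MASTER_HOST is not set"
  end
  ip = vars["SPARK_MASTER_IP"]
  if ip && ip.match?(/\A\d+\z/)
    problems << "SPARK_MASTER_IP looks like a port number: #{ip}"
  end
  problems
end

# Sample contents are made up to mirror the two symptoms described above.
broken = "export SPARK_MASTER_IP=7077\nexport SPARK_MASTER_PORT=7077\n"
puts spark_env_problems(broken)
```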


            People

            • Assignee:
              evans_ye Evans Ye
              Reporter:
              cos Konstantin Boudnik
