BIGTOP-2764

deployment failure when roles include spark::common and spark::yarn*

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.1
    • Component/s: deployment
    • Labels:
      None

      Description

      I use bigtop roles to control what gets deployed. When I include `spark-client` and `spark-on-yarn`, I get the following deployment failure:

      unit-spark-0: 15:15:24 INFO unit.spark/0.config-changed Error: Evaluation Error: Error while evaluating a Resource Statement, Duplicate declaration: Package[spark-datanucleus] is already declared in file /home/ubuntu/bigtop.release/bigtop-1.2.0/bigtop-deploy/puppet/modules/spark/manifests/init.pp:158; cannot redeclare at /home/ubuntu/bigtop.release/bigtop-1.2.0/bigtop-deploy/puppet/modules/spark/manifests/init.pp:132 at /home/ubuntu/bigtop.release/bigtop-1.2.0/bigtop-deploy/puppet/modules/spark/manifests/init.pp:132:5 on node machine-0.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
      

      This happens because the spark-datanucleus package is declared in both the common class:

      https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/spark/manifests/init.pp#L158

      and the yarn classes:

      https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/spark/manifests/init.pp#L117
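
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual Bigtop manifest: the `demo::common` and `demo::yarn` class names are hypothetical stand-ins for the common and yarn classes, and the `ensure => latest` setting is an assumption. It shows two classes that each declare a package resource with the same title, which Puppet rejects at catalog compilation once both classes are applied to the same node.

```puppet
# Hypothetical classes for illustration only (not the Bigtop spark module).
class demo::common {
  # First declaration of the package resource.
  package { 'spark-datanucleus':
    ensure => latest,
  }
}

class demo::yarn {
  # Second declaration with the same resource type and title.
  package { 'spark-datanucleus':
    ensure => latest,
  }
}

# Applying both classes to one node puts Package[spark-datanucleus]
# in the catalog twice, so compilation stops with a
# "Duplicate declaration" error.
include demo::common
include demo::yarn
```

Either class compiles on its own; the conflict only appears when the selected roles pull in both.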

          Activity

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe closed the pull request at:

          https://github.com/apache/bigtop/pull/205

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe commented on the issue:

          https://github.com/apache/bigtop/pull/205

          Closed by https://github.com/apache/bigtop/commit/73864a30b29ee89655772ce82b4041c2564d0711

          evans_ye Evans Ye added a comment -

          Committed and pushed.
          Thanks Kevin W Monroe.

          githubbot ASF GitHub Bot added a comment -

          Github user evans-ye commented on the issue:

          https://github.com/apache/bigtop/pull/205

          Thanks for the review @ktsakalozos.
          +1 as well.

          githubbot ASF GitHub Bot added a comment -

          Github user ktsakalozos commented on the issue:

          https://github.com/apache/bigtop/pull/205

          LGTM +1

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe commented on the issue:

          https://github.com/apache/bigtop/pull/205

          [spark-39](https://jujucharms.com/spark/39) contains this fix, and is green across clouds for hadoop + spark in yarn mode:

          http://bigtop.charm.qa/cwr_bundle_hadoop_spark/33/report.html

          githubbot ASF GitHub Bot added a comment -

          Github user kwmonroe commented on the issue:

          https://github.com/apache/bigtop/pull/205

          I've updated the original workaround to include `spark::datanucleus` instead of redefining the package. This is much cleaner than including multiple comment blocks to explain the 2268 hack.
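
As a rough illustration of why including the class avoids the conflict, here is a minimal sketch. The `demo::*` class names are hypothetical and the `ensure => latest` setting is an assumption; this is not the committed Bigtop code. The package is declared exactly once, inside its own class, and every other class pulls it in with `include`, which Puppet allows any number of times for the same class.

```puppet
# Hypothetical classes for illustration only (not the Bigtop spark module).
class demo::datanucleus {
  # The only declaration of the package resource.
  package { 'spark-datanucleus':
    ensure => latest,
  }
}

class demo::common {
  # Pull in the package through its class instead of redeclaring it.
  include demo::datanucleus
}

class demo::yarn {
  # A repeated include of the same class is idempotent.
  include demo::datanucleus
}

# Both classes can now be applied to one node; the package
# appears in the catalog once.
include demo::common
include demo::yarn
```

Because the declaration lives in one place, no comment blocks are needed to explain why it is present in one class and absent in another.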

          evans_ye Evans Ye added a comment - - edited

          Commented in https://github.com/apache/bigtop/pull/208.
          I would love to see your PR here so that I can rebase on it.

          kwmonroe Kevin W Monroe added a comment -

          +1 on including the class vs. redefining the package. This has the added benefit of not needing giant comment blocks that document our hacks.

          I left a comment in https://github.com/apache/bigtop/pull/208; if you'd like to make this change as part of your BIGTOP-2766 PR, I'm +1 and will close this as a dupe. Otherwise, I can make it once 2766 lands. Thanks!

          evans_ye Evans Ye added a comment -

          Thanks. I have one question: is it possible to just include the datanucleus class to avoid the conflict?
          It seems to work based on my test.

          kwmonroe Kevin W Monroe added a comment -

          It's not too pretty, Evans Ye, but the linked PR implements option 2 from my earlier comment.

          Tested locally and looks good. I'll spin up a couple spark clusters in standalone and yarn mode to test more thoroughly.

          githubbot ASF GitHub Bot added a comment -

          GitHub user kwmonroe opened a pull request:

          https://github.com/apache/bigtop/pull/205

          BIGTOP-2764: deployment failure when roles include spark::common and spark::yarn*

          We need to ensure the `spark-datanucleus` package is only defined once. There's a hack in place so it is always defined in `spark::common` (see BIGTOP-2268):

          https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/spark/manifests/init.pp#L154

          This PR comments out the other definition, which should be reinstated if/when 2268 is fixed.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/juju-solutions/bigtop bug/BIGTOP-2764/duplicate-datanucleus

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/bigtop/pull/205.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #205


          commit 28e001443a3aaf2db390496e35763eaa7b94926f
          Author: Kevin W Monroe <kevin.monroe@canonical.com>
          Date: 2017-05-10T16:32:24Z

          BIGTOP-2764: remove duplicate spark-datanucleus definition


          evans_ye Evans Ye added a comment -

          Kevin W Monroe, are you working on this?
          I have related work on fixing the Spark standalone deployment. Currently the Spark Worker fails because master_url is always set to yarn.
          I would love to see your patch, but if you're busy, I can submit one.

          kwmonroe Kevin W Monroe added a comment -

          This is caused by the unfortunate requirement to include datanucleus for all spark deployments (see BIGTOP-2268).

          My proposal is to do one of the following:

          1. spend some time on BIGTOP-2268 so hive/datanucleus is properly untangled from spark, and remove the hack.
          2. temporarily remove the datanucleus inclusion from yarn and yarn_slave since we know datanucleus will already be provided by the common class.

          For reference, here's my site.yaml that triggers this failure:

          $ cat ./bigtop.release/bigtop-1.2.0/bigtop-deploy/puppet/hieradata/site.yaml
          # Juju manages this file. Modifications may be overwritten!
          bigtop::bigtop_repo_uri: http://bigtop-repos.s3.amazonaws.com/releases/1.2.0/ubuntu/16.04/x86_64
          bigtop::hadoop_head_node: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          bigtop::jdk_preinstalled: true
          bigtop::roles:
          - hadoop-client
          - spark-client
          - spark-history-server
          - spark-master
          - spark-on-yarn
          - spark-worker
          - spark-yarn-slave
          bigtop::roles_enabled: true
          hadoop::common_hdfs::hadoop_namenode_host: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          hadoop::common_hdfs::namenode_datanode_registration_ip_hostname_check: false
          hadoop::common_mapred_app::jobtracker_host: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          hadoop::common_mapred_app::mapreduce_jobhistory_host: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          hadoop::common_yarn::hadoop_ps_host: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          hadoop::common_yarn::hadoop_rm_host: machine-1.qnvvwf5bfm5uzgvdpofqo5megg.dx.internal.cloudapp.net
          hadoop::common_yarn::yarn_nodemanager_vmem_check_enabled: false
          hadoop::hadoop_storage_dirs:
          - /data/1
          - /data/2
          spark::common::event_log_dir: hdfs:///var/log/spark/apps
          spark::common::history_log_dir: hdfs:///var/log/spark/apps
          spark::common::master_host: 192.168.0.4
          spark::common::master_url: yarn-client
          spark::common::zookeeper_connection_string: null
          

            People

            • Assignee:
              kwmonroe Kevin W Monroe
              Reporter:
              kwmonroe Kevin W Monroe