Bigtop / BIGTOP-1634

Puppet class parameter and hiera conversion

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: backlog
    • Fix Version/s: 1.0.0
    • Component/s: deployment
    • Labels:

      Description

      As discussed on the DEV list:
      Update the puppet code to use self-contained, parametrised classes and proper scoping. Replace all extlookup calls by either explicit or automatic hiera parameter lookups. Implement the HA/non-HA alternative via the hiera lookup hierarchy. Replace append_each from bigtop_util with suffix from stdlib. Do file imports via the puppet:/// scheme. Remove bigtop_util, because its remaining function get_settings is no longer needed.

      Additionally, add configuration options for zookeeper and yarn, as well as a new class for journalnode configuration.

      I've separated it into two patches for ease of review:
      0001: actual hiera/class conversion
      0002: functional enhancements including journalnode configuration on top of that. Mainly meant as an example for ease of further expansion and containment of changes to single modules.

      This JIRA is meant for the stuff contained in 0001, the actual hiera conversion. I can resubmit 0002 as a separate JIRA if desired. Also it should be possible to backport 0002 to the current puppet code base without much fuss.

        Activity

        evans_ye Evans Ye added a comment -

        Committed. Big thanks to [~michael weiser]. This is a great improvement to our puppet recipes, and I'm looking forward to the completion of the remaining parts.

        evans_ye Evans Ye added a comment -

        Okay, tested and works. +1 on the patch and I'll commit shortly.
        After this is in, we can move forward to improve features and fix known issues w/o being blocked. I'll first update both the vm and docker provisioners to reflect the change, so that we have a testing environment for further puppet recipe updates.

        evans_ye Evans Ye added a comment -

        Thanks for the update, Michael Weiser. I think we're on the same page now. I'll do the final test on the newest patch and then commit this if no one objects.

        michaelweiser Michael Weiser added a comment -

        corrected updated patch relative to HEAD + BIGTOP-1650 + BIGTOP-1651

        michaelweiser Michael Weiser added a comment -

        Updated patch relative to HEAD + BIGTOP-1650 + BIGTOP-1651

        michaelweiser Michael Weiser added a comment -

        once you complete the work.

        This patch includes the journalnode class in module hadoop. I've got that working in our setup and it seems to do all that is necessary. I had planned to put together an example of how we do node role assignment in our setup and how cluster.pp could be reorganized to do the same. And I would certainly like to clean up the identifier mess I've created (dropping hadoop_ prefixes where they're not necessary). But that's for other JIRAs.

        different array representation on yamls

        Ah, right, I forgot: site.csv's

        hadoop_data_dirs,/data/1,/data/2,/data/3

        can be written as

        hadoop::common_hdfs::hadoop_data_dirs: [ '/data/1', '/data/2', '/data/3' ]

        which would still allow for one-liners in setup scripts. As long as there are no spaces in the paths, even

        hadoop::common_hdfs::hadoop_data_dirs: [/data/1,/data/2,/data/3]

        works, which is as site.csv-ish as it can get.

        evans_ye Evans Ye added a comment -

        I should be around for triaging fallout of this change for at least the next six months. I can not, however, promise continued involvement in Bigtop

        That's fine. I can take over the maintainer role for the journalnode stuff once you complete the work. Personally, I'd like to avoid a half-implemented feature w/o explicit support for it, so if you can get the feature to the finish line, I'd be happy to +1 and commit this.

        It'd basically be a step backwards in terms of elegance and reduction of complexity

        I see, and I agree with you. I've tried a different array representation in the yamls for the vm and docker provisioners. I think it should work with only slight updates. So let's stick to the current approach, and I'll file a jira to update both provisioners once this is in.

        michaelweiser Michael Weiser added a comment -

        While journaling of HDFS is a new feature that will potentially be added.

        As said at the very top: Journalling can certainly be split out into a separate jira to trim down the size of this change.

        Would you like to get on the boat of maintainers?

        I should be around for triaging fallout of this change for at least the next six months. I can not, however, promise continued involvement in Bigtop since the project we're doing has a limited time frame after which I'll most likely have to move on to other things.

        Can we retain the logic to support comma-separated strings

        That's certainly possible. We'd have to add code for all arrays to check whether they're comma-separated strings and split them at the commas. The downside is that it would add quite a bit of code or a custom function, because up to now all the comma-separated lists in site.csv were actually just extlookup()'s way of representing arrays and would end up as arrays in puppet (with the exception of single-item lists). So it'd basically be a step backwards in terms of elegance and reduction of complexity to support extlookup's native format for expressing arrays in yaml, which already has a syntax for exactly the same thing.

        Could we perhaps just supply a small script for converting site.csvs into site.yamls or generating site.yamls from command line parameters? It should be trivial and could be added as a one-liner to vagrant and docker scripts.
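        Such a converter could be sketched in a few lines of Python. This is only an illustrative sketch: the mapping from bare site.csv keys like hadoop_data_dirs to scoped hiera keys like hadoop::common_hdfs::hadoop_data_dirs is hypothetical here and would have to come from a real table of the modules' class parameters.

```python
# Hypothetical sketch of a site.csv -> site.yaml line converter.
# The key -> hiera-scope mapping is illustrative only; a real converter
# would need the actual class parameter names from the puppet modules.
SCOPE = {
    "hadoop_data_dirs": "hadoop::common_hdfs::hadoop_data_dirs",
}

def csv_line_to_yaml(line):
    # site.csv format: key,value[,value...]
    key, *values = line.strip().split(",")
    hiera_key = SCOPE.get(key, key)
    if len(values) == 1:
        # scalar value
        return f"{hiera_key}: '{values[0]}'"
    # multi-value lines become YAML flow sequences
    items = ", ".join(f"'{v}'" for v in values)
    return f"{hiera_key}: [ {items} ]"

print(csv_line_to_yaml("hadoop_data_dirs,/data/1,/data/2,/data/3"))
# hadoop::common_hdfs::hadoop_data_dirs: [ '/data/1', '/data/2', '/data/3' ]
```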

        evans_ye Evans Ye added a comment - edited

        Hey jay, thanks for stepping in.

        Are the new features in this puppet file maintained by someone? (For example, I see kerberos. Do we support that? Journalling of HDFS. Do we need/have maintainers for those?)

        Well, kerberos is a thing we already have, so there are no changes introduced in this jira except the naming (scope) of parameters for hiera. Journaling of HDFS, on the other hand, is a new feature that will potentially be added. Michael Weiser, I see you're continuously working on this in several jiras. Would you like to come on board as a maintainer? That way we can check in the journalnode preparations in this jira w/o hesitation.

        Do the existing changes pass the existing vagrant-docker or vm tests?

        Good point! Can we retain the logic to support comma-separated strings (hadoop,yarn,hbase) for components or other configurations? That would make the vagrant-docker and vm tests compatible with this w/ only slight changes.

        jayunit100 jay vyas added a comment - edited

        Hi evans. I think the criteria for commit are simple.

        • Are the new features in this puppet file maintained by someone? (For example, I see kerberos. Do we support that? Journalling of HDFS. Do we need/have maintainers for those?)
        • Do the existing changes pass the existing vagrant-docker or vm tests?

        If so, and if the code passes your approval, then go ahead and commit!

        evans_ye Evans Ye added a comment -

        OK, here are the test results:

        • The deploy of a kerberosized cluster failed, but both the new hiera and the old extlookup puppet recipes failed to set up a kerberos cluster, so the problem might exist before applying this. I'll take a look in another jira.
        • Some components failed to deploy:
          hcatalog-server
          Debug: Executing '/usr/bin/yum -d 0 -e 0 -y list hcatalog-server'
          Error: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y list hcatalog-server' returned 1: Error: No matching Packages to list
          Wrapped exception:
          Execution of '/usr/bin/yum -d 0 -e 0 -y list hcatalog-server' returned 1: Error: No matching Packages to list
          Error: /Stage[main]/Hcatalog::Server/Package[hcatalog-server]/ensure: change from absent to latest failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y list hcatalog-server' returned 1: Error: No matching Packages to list
          

          webhcat-server

          Debug: Executing '/usr/bin/yum -d 0 -e 0 -y list webhcat-server'
          Error: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y list webhcat-server' returned 1: Error: No matching Packages to list
          Wrapped exception:
          Execution of '/usr/bin/yum -d 0 -e 0 -y list webhcat-server' returned 1: Error: No matching Packages to list
          Error: /Stage[main]/Hcatalog::Webhcat::Server/Package[webhcat-server]/ensure: change from absent to latest failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y list webhcat-server' returned 1: Error: No matching Packages to list
          

          Solr

          Debug: Executing '/bin/bash -c '/usr/bin/solrctl debug-dump | grep -q solr.xml || /usr/bin/solrctl init''
          Notice: /Stage[main]/Solr::Server/Exec[solr init]/returns: Error: failed to initialize Solr
          Error: /bin/bash -c '/usr/bin/solrctl debug-dump | grep -q solr.xml || /usr/bin/solrctl init' returned 1 instead of one of [0]
          Error: /Stage[main]/Solr::Server/Exec[solr init]/returns: change from notrun to 0 failed: /bin/bash -c '/usr/bin/solrctl debug-dump | grep -q solr.xml || /usr/bin/solrctl init' returned 1 instead of one of [0]
          

          hcatalog-server and webhcat-server seem to be missing a leading hive-, and solr needs to be fixed as well, but none of them were introduced by this jira.

        To sum up, all the failures are caused by pre-existing issues; other than that, this jira works well for deploying a bigtop cluster.
        I think this jira is good enough to get in, so +1 from me, but since this is a big change to our puppet recipes, it would be better to have another committer/maintainer agree with it. Maybe jay vyas can help?

        evans_ye Evans Ye added a comment -

        Hey Michael Weiser, I got your point now. So topology settings and journal-based HA should be the next steps after we get this in, correct? For the journalnode storage logic, I think it's better to configure a different directory, but we can address this when we're actually implementing the journalnode logic.
        Thanks for fixing the tachyon deployment; let me deploy all components on my side and see what's going on with the others you mentioned.
        I've applied your patch and now I can successfully deploy a cluster with hadoop and yarn. It seems we're getting close to the commit! Now, trying to enable kerberos. Will let you know the result.

        rnp Richard Pelavin added a comment -

        I am focusing on testing with puppet apply

        michaelweiser Michael Weiser added a comment -

        make $first_namenode work once more

        michaelweiser Michael Weiser added a comment -

        right patch file

        michaelweiser Michael Weiser added a comment -
        Error: suffix(): expected first argument to be an Array, got "/data/1" at /bigtop-home/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp:149 on node bigtop1.docker

        What I really meant to say was:

        $journalnode_edits_dir = "${hadoop::hadoop_storage_dirs[0]}/journalnode",

        meaning "put the journalnode edits dir in the first hadoop storage dir by default". No idea if that's a good default, because we're using a different directory layout in our setup and everything is overridden directly from hiera. No idea why that error didn't crop up in my tests with the default setup either; now it did. Sorry again. New patch attached.

        How are you testing the journalnode stuff? I haven't put any of the necessary glue code into cluster.pp because we're not using cluster.pp and the journalnode stuff is bleeding edge on our end as well. I guess it's not much use to pop a component "journalnode" in there because the whole concept of master and standby namenode doesn't apply in that case, does it?

        However, if I do not specify components in config, hence deploying all

        That doesn't work for me either, but for a different reason: the package names in hcatalog are wrong; they're missing a leading hive-, I think. Tachyon also fails later on a missing init script. And finally solr doesn't start, for whatever reason.

        Error: Could not update: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install hcatalog-server' returned 100: Reading package lists...
        Error: /Stage[main]/Tachyon::Master/Service[tachyon-master]: Could not evaluate: Could not find init script for 'tachyon-master'
        

        I have reproduced those with the stock puppet recipes. So as far as I can tell they're not regressions. Unfortunately we're using neither component so I've got no expertise to track down what's going on with them.

        With all but those three components enabled, puppet agent -t runs all the way through for me though.

        Error: Could not find resource 'Service[tachyon-master]' for relationship on 'Service[tachyon-worker]' on node bigtop1.docker

        Sigh. It must be the third time I've fixed that stupid scoping bug in the tachyon module. It seems to go missing all the time somehow. It's fixed in the attached patch.

        Also probably useful to mention if there is any version requirement on stdlib

        stdlib was already required. It came in with commit 82922353d2ee80bd08a4e9edad552b44fe0e38be by jay vyas (BIGTOP-1553), which introduced any2array to split the components list. That's still used. The only function I use is suffix, to replace append_each from bigtop_util. Both any2array and suffix seem to have been introduced with stdlib 4.0.0. Long story short: the modules had and have a dependency on stdlib >= 4.0.0. Updated the README in the attached patch accordingly.
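        For readers unfamiliar with these stdlib functions, their behaviour (and the "expected first argument to be an Array" failure seen earlier in this thread) can be sketched as a Python analogue. This is illustrative only and mirrors the documented semantics, not the actual Ruby implementation in puppetlabs-stdlib:

```python
# Illustrative Python analogue of puppetlabs-stdlib's suffix() and any2array().
# Not the real implementation; it only mirrors the documented semantics.

def suffix(values, sfx):
    # suffix() requires an array argument; passing a bare string produces
    # exactly the "expected first argument to be an Array" error.
    if not isinstance(values, list):
        raise TypeError(
            f'suffix(): expected first argument to be an Array, got "{values}"'
        )
    return [v + sfx for v in values]

def any2array(value):
    # any2array() wraps non-array values so they can be iterated safely.
    return value if isinstance(value, list) else [value]

print(suffix(["/data/1", "/data/2"], "/journalnode"))
# ['/data/1/journalnode', '/data/2/journalnode']
print(suffix(any2array("/data/1"), "/journalnode"))
# ['/data/1/journalnode']
```

        This also shows why Richard's fix of wrapping the first argument in brackets ([$hadoop::hadoop_storage_dirs[0]]) satisfies suffix().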

        Another thing to look out for: I'm testing all this stuff in a puppet master/agent setup, not with puppet apply. I only noticed that discrepancy when updating the README - which obviously I did last. ;-/ Also, I have changed file imports to puppet:/// URIs which should work with apply but I haven't tested it. Is it working for you so far? (It's only used for the ssh keys, I think.)

        rnp Richard Pelavin added a comment -

        Ran into the same problem that Evans reported; I was also not sure of the semantics, so I fixed it syntactically. The way I fixed it in my local copy was:

        diff --git a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
        index 4035529..2ce49fb 100644
        --- a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
        +++ b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp
        @@ -146,7 +146,7 @@ class hadoop ($hadoop_security_authentication = "simple",
           $journalnode_port = "8485",
           $journalnode_http_port = "8480",
           $journalnode_https_port = "8481",
        -  $journalnode_edits_dir = suffix($hadoop::hadoop_storage_dirs[0], "/journalnode"),
        +  $journalnode_edits_dir = suffix([$hadoop::hadoop_storage_dirs[0]], "/journalnode"),
           $shared_edits_dir = "/hdfs_shared",
           $testonly_hdfs_sshkeys = "no",
           $hadoop_ha_sshfence_user_home = "/var/lib/hadoop-hdfs",

        Also probably useful to mention if there is any version requirement on stdlib

        evans_ye Evans Ye added a comment -

        No prob, Michael Weiser. Let's work together to get this in.
        I forgot to mention that there was a change I made to make it work. Here's the original error I got:

        Error: suffix(): expected first argument to be an Array, got "/data/1" at /bigtop-home/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp:149 on node bigtop1.docker
        

        I updated $hadoop::hadoop_storage_dirs[0] to $hadoop::hadoop_storage_dirs, and then it worked.
        But I don't know if this is the logic you want; maybe you can confirm.
        I can successfully reach the end of puppet apply w/o errors now.
        Here's my /etc/puppet/hieradata/site.yaml in case someone else is also interested in this:

        bigtop::hadoop_head_node: "bigtop1.docker"
        hadoop::hadoop_storage_dirs:
        - "/data/1"
        - "/data/2"
        Bigtop::bigtop_yumrepo_uri: "http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/"
        hadoop_cluster_node::cluster_components:
          - hadoop
          - yarn
        bigtop::jdk_package_name: "java-1.7.0-openjdk-devel.x86_64"
        

        However, if I do not specify components in config, hence deploying all, I got the following error:

        Error: Could not find resource 'Service[tachyon-master]' for relationship on 'Service[tachyon-worker]' on node bigtop1.docker
        

        I'm not sure whether that is related to this patch. I'll look into this, or maybe jay vyas can take a glance at it.

        Getting back to this patch, I'll go further and test the journal-based HA. This is the huge improvement we'd like to have, and it makes me really excited about trying it.

        Show
        evans_ye Evans Ye added a comment - No prob Michael Weiser . Let's work together to get this in. I forgot to mention that there was a change I made to make it works. Here's the original error I got: Error: suffix(): expected first argument to be an Array, got "/data/1" at /bigtop-home/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp:149 on node bigtop1.docker I updated $hadoop::hadoop_storage_dirs [0] to $hadoop::hadoop_storage_dirs , then it works. But I don't know if this is the logic you want, maybe you can confirm with me. I can successfully reach the end of puppet apply w/o errors now. Here's my /etc/puppet/hieradata/site.yaml in case someone is also interesting in this: bigtop::hadoop_head_node: "bigtop1.docker" hadoop::hadoop_storage_dirs: - "/data/1" - "/data/2" Bigtop::bigtop_yumrepo_uri: "http: //bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/" hadoop_cluster_node::cluster_components: - hadoop - yarn bigtop::jdk_package_name: "java-1.7.0-openjdk-devel.x86_64" However, if I do not specify components in config, hence deploying all, I got the following error: Error: Could not find resource 'Service[tachyon-master]' for relationship on 'Service[tachyon-worker]' on node bigtop1.docker I'm not sure whether that is related to this patch. I'll look into this, or maybe jay vyas can take a glance on it Getting back to this patch, I'll go further to test the journal based HA. This is the huge improvement we'd like to have, making me really excited about trying it
        michaelweiser Michael Weiser added a comment -

        Evans Ye: My fault on both counts. Sorry. I updated the README and restored the inclass call to include in hadoop-hbase's init.pp. New patch attached.
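        For anyone hitting the same "Unknown function inclass" error: inclass is not part of puppetlabs-stdlib, so calls to it fail unless a custom function ships alongside the modules. A hedged sketch of the plain-Puppet equivalents (the hadoop::datanode class name here is hypothetical):

        ```puppet
        # Fails with "Unknown function inclass" — no such function in stdlib:
        # if inclass('hadoop::datanode') { ... }

        # Built-in alternatives:
        include hadoop::datanode                  # declare the class; safe to repeat

        if defined(Class['hadoop::datanode']) {   # test whether it is already declared
          notice('datanode class is in the catalog')
        }
        ```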
        evans_ye Evans Ye added a comment - edited

        Hey Michael Weiser, first of all, what great work!
        I got some errors during my test and would like your advice.

        I followed the updated README.md under the puppet directory when configuring the deployment.
        My /etc/puppet/hieradata/site.yaml:

        bigtop::hadoop_head_node: "bigtop1.docker"
        hadoop::hadoop_storage_dirs: "/data/1,/data/2"
        Bigtop::bigtop_yumrepo_uri: "http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/"

        This brought me to:

        Error: suffix(): expected first argument to be an Array, got "/data/1,/data/2" at /bigtop-home/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp:143 on node bigtop1.docker

        So I changed the configuration to this:

        hadoop::hadoop_storage_dirs:
          - /data/1
          - /data/2

        That seemed to pass, but then I hit another one:

        Error: Unknown function inclass at /bigtop-home/bigtop-deploy/puppet/modules/hadoop-hbase/manifests/init.pp:49 on node bigtop1.docker

        I already have stdlib installed under /etc/puppet/modules, so I have no clue why this is happening. I'll try around later.
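        The site.yaml keys above work because, with the parametrised classes this patch introduces, hiera keys of the form class::parameter are looked up automatically when the class is declared — no explicit extlookup() calls needed. A minimal sketch of that pattern, assuming illustrative parameter names and defaults (the real ones live in the Bigtop modules):

        ```puppet
        # Declaring this class makes puppet automatically look up
        # "hadoop::hadoop_storage_dirs" in hiera (e.g. from site.yaml):
        class hadoop (
          $hadoop_storage_dirs = ['/data/1'],   # default if hiera has no value
        ) {
          # stdlib's suffix() expects an array, so the hiera value
          # must be a YAML list, not a comma-separated string:
          $datanode_dirs = suffix($hadoop_storage_dirs, '/hdfs')
        }

        include hadoop   # triggers the automatic parameter lookup
        ```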
        jayunit100 jay vyas added a comment -

        Ah okay, that was a good idea. Thanks for consolidating, and also for the final review!
        michaelweiser Michael Weiser added a comment -

        jay vyas: Here's the whole thing as one patch. I separated them intentionally to make it easier to spot the class/hiera changes.
        jayunit100 jay vyas added a comment - edited

        Michael Weiser, thanks! This looks like a huge improvement: a serious reduction in puppet complexity while retaining the same logic.

        In general we use the https://cwiki.apache.org/confluence/display/BIGTOP/How+to+Contribute workflow (git format-patch --stdout HEAD~1..HEAD) for contributions, to ensure you get proper credit, but I'm more than happy to roll these together for you after testing. I can test these today if nobody beats me to it.
        michaelweiser Michael Weiser added a comment -

        Patches as promised.

          People

          • Assignee:
            michaelweiser Michael Weiser
          • Reporter:
            michaelweiser Michael Weiser
          • Votes: 0
          • Watchers: 4
