Bigtop / BIGTOP-1690

Puppet should automatically create data directories

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 1.1.0
    • Component/s: deployment
    • Labels:
      None

      Description

      Right now a user has to manually create the directories specified in hadoop::hadoop_storage_dirs, which is a shame. Let's create them recursively if they don't exist yet.

      1. BIGTOP-1690.1.patch
        1 kB
        Sergey Soldatov
      2. BIGTOP-1690.2.patch
        1 kB
        Sergey Soldatov
      3. BIGTOP-1690.3.patch
        1 kB
        Sergey Soldatov

          Activity

          michaelweiser Michael Weiser added a comment -

          Hi Konstantin Boudnik,

          same problem here. Puppet won't help us natively: they won't implement it in the file type because of corner cases such as who should own the parent directories and what permissions they should have: https://projects.puppetlabs.com/issues/86. They recommend just executing mkdir -p: https://github.com/ghoneycutt/puppet-module-common/blob/master/manifests/mkdir_p.pp. If that is acceptable, I can certainly give implementing it a whack.

          cos Konstantin Boudnik added a comment -

          Indeed. That's what I was referring to... If the parent dirs exist, then mkdir -p won't change their permissions or do anything harmful. At the same time, the cluster would be brought up without any manual intervention by the user.

          cos Konstantin Boudnik added a comment -

          Michael Weiser, are you still interested in this ticket?

          michaelweiser Michael Weiser added a comment -

          Hi Konstantin Boudnik: I'm still facing the same problem and still need a solution for our own project, but I have been very short on time to work on it over the last few months. It's escalating on my end as well, so I think I should have a patch for it within the next two weeks.

          sergey.soldatov Sergey Soldatov added a comment -

          Since Puppet doesn't support recursive directory creation, there are two options:
          1. Execute mkdir -p.
          2. Add a new parameter such as hadoop::hdfs_base_dirs and create those directories in the regular way before the hdfs data dirs are created.

          The second way is preferable according to the Puppet guidelines, but it requires adding an additional property to the site.yaml.
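
For illustration, the second option might look roughly like the sketch below. It is not taken from any patch on this ticket: the variable names and paths are assumptions, and in Bigtop the values would come from site.yaml (for example the proposed hadoop::hdfs_base_dirs) rather than being assigned inline. It relies on Puppet's file type autorequiring any parent directory that Puppet also manages.

```puppet
# Hypothetical sketch of option 2: manage the parent directories explicitly.
# Both lists are illustrative; real values would be supplied via site.yaml.
$hdfs_base_dirs      = ['/data', '/data/1', '/data/2']
$hadoop_storage_dirs = ['/data/1/hadoop', '/data/2/hadoop']

# Parents are ordinary file resources, so owner and mode are stated explicitly.
file { $hdfs_base_dirs:
  ensure => directory,
  owner  => 'root',
  group  => 'root',
  mode   => '0755',
}

# Each storage directory autorequires its managed parent directory,
# so the parents above are applied first without an explicit 'require'.
file { $hadoop_storage_dirs:
  ensure => directory,
}
```

The cost is the one named above: every parent level has to be listed in a second property, which is why mkdir -p is the shorter route.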

          cos Konstantin Boudnik added a comment -

          I honestly prefer the mkdir -p path. Our case isn't very complex, and it should work just fine. We had an offline chat with Richard Pelavin, and he suggests the same.

          sergey.soldatov Sergey Soldatov added a comment -

          Added mkdir -p for storage dirs.

          cos Konstantin Boudnik added a comment -

          I would actually replace the test with

            creates => "$name",
          

          in the exec section.
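
A minimal sketch of what an exec guarded by 'creates' could look like is shown here. It is an assumption about the general shape, not the attached patch: the defined type body, the /bin/mkdir path, and the example directories are illustrative.

```puppet
# Sketch of a defined type that creates one storage directory per title.
define create_storage_dir {
  exec { "mkdir ${name}":
    # -p creates missing parents and leaves existing ones untouched.
    command => "/bin/mkdir -p ${name}",
    # 'creates' makes the exec idempotent: it is skipped once the path exists.
    creates => $name,
    user    => 'root',
  }
}

# Declaring it with an array title yields one resource per directory.
create_storage_dir { ['/data/1/hadoop', '/data/2/hadoop']: }
```

Using 'creates' rather than an 'unless' or 'onlyif' test keeps the existence check inside Puppet, so no extra shell command runs on each agent run.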

          sergey.soldatov Sergey Soldatov added a comment -

          Changed to 'creates'

          cos Konstantin Boudnik added a comment -

          Yup, looks good. I will test it shortly as a part of something I am doing right now, and commit later in the day unless I hear otherwise from our Puppet gurus.

          cos Konstantin Boudnik added a comment -

          Restating my earlier deleted comment: I still think the triggering of the storage directory creation needs to be tied to hdfs-only functionality. Perhaps the call into the function needs to be moved to the class common_hdfs or something? Richard Pelavin, any suggestions?

          sergey.soldatov Sergey Soldatov added a comment -

          Nope. common_yarn, which is not related to common_hdfs, uses the storage dirs as well.

          rnp Richard Pelavin added a comment - edited

          With respect to "mkdir -p": as stated in the comments above, it only leads to problems in some corner cases, so I think it makes sense here.

          With respect to where create_storage_dir should be called: this depends on whether any roles include common_hdfs but should not have these directories created. For example, I don't think you want them created on a client.

          The easiest way to restrict which roles create the directories is to nest the call under a conditional. So, rather than simply calling

          create_storage_dir { $hadoop_storage_dirs: }

          you can nest it under a conditional, like

          class deploy ($roles) {
            if ("datanode" in $roles) { # add "or" clauses for any other role that needs it
              create_storage_dir { $hadoop_storage_dirs: }
            }
          }
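
A slightly fuller sketch of the role check with more than one role might look like this. It assumes a create_storage_dir defined type already exists; the 'nodemanager' role name, the extra class parameter, and the example paths are hypothetical and only illustrate the "or" disjunction.

```puppet
# Sketch only: 'nodemanager' is an assumed second role that needs local storage.
class deploy ($roles, $hadoop_storage_dirs) {
  if ('datanode' in $roles) or ('nodemanager' in $roles) {
    # Assumes a create_storage_dir defined type is available on the module path.
    create_storage_dir { $hadoop_storage_dirs: }
  }
}

# Example declaration: a node with only a client role would skip the directories.
class { 'deploy':
  roles               => ['datanode'],
  hadoop_storage_dirs => ['/data/1/hadoop', '/data/2/hadoop'],
}
```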

          cos Konstantin Boudnik added a comment -

          Yeah, that makes sense. I like it. And even if the roles are disabled (as they are by default), this will still work, because in that case we automatically consider all nodes of a cluster to be worker nodes, IIRC.

          sergey.soldatov Sergey Soldatov added a comment -

          Something like that?

          cos Konstantin Boudnik added a comment -

          Looks good! Unless there are other suggestions I will commit it soon.

          cos Konstantin Boudnik added a comment -

          Pushed to the master. Thanks Sergey Soldatov

          michaelweiser Michael Weiser added a comment -

          Hi guys, sorry for dropping the ball on this. Just today I got around to looking into this again. I came up with what I think are a couple of minor improvements, which I put in a new JIRA at BIGTOP-2153. Feel free to comment on it.
          Thanks,
          Michael


            People

            • Assignee: Sergey Soldatov
            • Reporter: Konstantin Boudnik
            • Votes: 0
            • Watchers: 4
