Bigtop
  1. Bigtop
  2. BIGTOP-1037

Provide a mechanism to control the sourcing of defaults files

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None

      Description

      Ideally, defaults files should only contain name / value pairs (as per Debian packaging policy), which makes it difficult for a cluster management system that may be managing Hadoop to control the environment when these files are sources and variables are perhaps exported into the environment unconditionally.

      In keeping with Bigtop's goal if integrating Hadoop with the underlying operating system, the default behavior should not change at all, but it would provide greater flexibility if there was a mechanism to disable the sourcing of files so that if a system wants the environment variables to be used exclusively, the defaults files won't override that behavior.

      Obviously since this may be a more rare use case right now, the solution should also be as safe and non-disruptive as possible.

        Activity

        Hide
        Sean Mackrory added a comment -

        Pushed.

        Show
        Sean Mackrory added a comment - Pushed.
        Hide
        Roman Shaposhnik added a comment -

        Makes perfect sense. Thanks.

        Show
        Roman Shaposhnik added a comment - Makes perfect sense. Thanks.
        Hide
        Sean Mackrory added a comment -

        It was more so that you could not only point to an alternate location for defaults files, but could disable the use of defaults files entirely so that the existing environment was not modified at all. Without the -n, you'd have to just point to an empty or non-existent directory, and that just feels more hacky. Unless someone sets BIGTOP_DEFAULTS_DIR=/, I don't think we should be trying to source whatever file named 'hadoop' may be in people's root directory - so defensive programming is also a good reason

        Show
        Sean Mackrory added a comment - It was more so that you could not only point to an alternate location for defaults files, but could disable the use of defaults files entirely so that the existing environment was not modified at all. Without the -n, you'd have to just point to an empty or non-existent directory, and that just feels more hacky. Unless someone sets BIGTOP_DEFAULTS_DIR=/, I don't think we should be trying to source whatever file named 'hadoop' may be in people's root directory - so defensive programming is also a good reason
        Hide
        Roman Shaposhnik added a comment -

        In general – looks good to me: +1.

        One nit thought: why -n "$

        {BIGTOP_DEFAULTS_DIR}

        " ? Is it just a bit of defensive programming?

        Show
        Roman Shaposhnik added a comment - In general – looks good to me: +1. One nit thought: why -n "$ {BIGTOP_DEFAULTS_DIR} " ? Is it just a bit of defensive programming?
        Hide
        Sean Mackrory added a comment -

        That is probably wise - thanks, Bruno. Updated the patch...

        Show
        Sean Mackrory added a comment - That is probably wise - thanks, Bruno. Updated the patch...
        Hide
        Bruno Mahé added a comment -

        Sounds reasonable and the standardization looks great!
        My only concern is with the variable name. DEFAULTS_DIR could collide with some other variable. So could we prepend a BIGTOP_ to it?

        Show
        Bruno Mahé added a comment - Sounds reasonable and the standardization looks great! My only concern is with the variable name. DEFAULTS_DIR could collide with some other variable. So could we prepend a BIGTOP _ to it?
        Hide
        Sean Mackrory added a comment - - edited

        In this patch, the environment variable DEFAULTS_DIR can be used to specify a directory containing the files to source (it will default to /etc/default, as it should). It can also be set to the empty string to disable the sourcing of these files by the wrapper scripts and rely on the existing environment entirely.

        I've also taken the liberty of standardizing all sourcing to using the -r check to validate the existence and readability of the files first. In my opinion, it's hands-down the best approach we've taken, and this is just standardizing on that check. I also corrected a few scripts that were using #!/bin/sh to specify the bash shell. No specific reason to in these cases - just a good chance to make sure these scripts are being consistent about the shells.

        My opinion on the ideal behavior is above, but I should say that I'm not married to the coding style used here. I'd like to be succinct, since this shouldn't be the "default" use case, and I would like to have avoided syntax like the [ -n ... -a ...] && lines. I struggled to find a solution that resulted in identical behavior, was easier to read, and was relatively brief. If you have suggestions, please do share because I think that's one thing that could be improved.

        edit: I've tested a couple of scripts thoroughly and inspected the output of the init.d.tmpl template to make sure it's all consistent - unless someone disagrees, I considered that plus a thorough inspection for consistency to be sufficient testing for this.

        Show
        Sean Mackrory added a comment - - edited In this patch, the environment variable DEFAULTS_DIR can be used to specify a directory containing the files to source (it will default to /etc/default, as it should). It can also be set to the empty string to disable the sourcing of these files by the wrapper scripts and rely on the existing environment entirely. I've also taken the liberty of standardizing all sourcing to using the -r check to validate the existence and readability of the files first. In my opinion, it's hands-down the best approach we've taken, and this is just standardizing on that check. I also corrected a few scripts that were using #!/bin/sh to specify the bash shell. No specific reason to in these cases - just a good chance to make sure these scripts are being consistent about the shells. My opinion on the ideal behavior is above, but I should say that I'm not married to the coding style used here. I'd like to be succinct, since this shouldn't be the "default" use case, and I would like to have avoided syntax like the [ -n ... -a ...] && lines. I struggled to find a solution that resulted in identical behavior, was easier to read, and was relatively brief. If you have suggestions, please do share because I think that's one thing that could be improved. edit: I've tested a couple of scripts thoroughly and inspected the output of the init.d.tmpl template to make sure it's all consistent - unless someone disagrees, I considered that plus a thorough inspection for consistency to be sufficient testing for this.

          People

          • Assignee:
            Sean Mackrory
            Reporter:
            Sean Mackrory
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development