Bigtop / BIGTOP-566

Flume NG pkg init script should allow user to customize the location of the conf dir

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Component/s: general
    • Labels: None
    • Environment: all

      Description

      A typical Flume NG invocation is:
      bin/flume-ng --conf-file conf/flume-conf.properties.template --conf conf --name agent1

      ...and the corresponding invocation after installing the Flume NG bigtop package would be:
      /usr/lib/flume-ng/bin/flume-ng --conf-file /etc/flume-ng/conf/flume.conf --conf /etc/flume-ng/conf --name agent

      The Flume NG init script, installed as part of Bigtop Flume NG, allows us to modify:
      1) the --conf-file value (via FLUME_CONF_FILE)
      2) the --name value (via FLUME_AGENT_NAME)

      ...but not the --conf value. So the user loses some flexibility when using Bigtop Flume NG over Apache Flume NG.

      I recommend that the /etc/init.d/flume-ng-agent script be modified to allow the user to set FLUME_CONF_DIR. That is, in this init script, please change this:
      FLUME_CONF_DIR=/etc/flume-ng/conf
      ..to this:
      FLUME_CONF_DIR=${FLUME_CONF_DIR:-/etc/flume-ng/conf}
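
The proposed change relies on the POSIX shell default-value expansion `${VAR:-default}`. The following is a minimal sketch of that behavior, not the actual init script: the helper name `resolve_conf_dir` and the override path `/opt/flume/conf` are assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; the init script itself would
# contain just the single FLUME_CONF_DIR assignment.
resolve_conf_dir() {
    # Keep the caller's FLUME_CONF_DIR if it is set and non-empty,
    # otherwise fall back to the packaged default.
    FLUME_CONF_DIR=${FLUME_CONF_DIR:-/etc/flume-ng/conf}
    printf '%s\n' "$FLUME_CONF_DIR"
}

# Case 1: no override in the environment, so the default is used.
default_dir=$(unset FLUME_CONF_DIR; resolve_conf_dir)

# Case 2: the user supplies a (hypothetical) custom directory.
custom_dir=$(FLUME_CONF_DIR=/opt/flume/conf; resolve_conf_dir)

echo "default: $default_dir"
echo "custom:  $custom_dir"
```

With this pattern a value already present when the script runs wins, while an unset or empty value falls back to /etc/flume-ng/conf.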
      1. BIGTOP-566.patch
        0.7 kB
        Will McQueen
      2. BIGTOP-566.patch
        0.7 kB
        Will McQueen

        Activity

        Roman Shaposhnik added a comment -

        Couple of comments:

        1. We strongly discourage the use of upstream scripts in Bigtop, so the invocation will be /usr/bin/flume-ng --conf-file /etc/flume-ng/conf/flume.conf --conf /etc/flume-ng/conf --name agent
        2. My understanding of Flume is that it is an agent-based system with long-running agents on each node, each having a configuration that changes only when the agent is restarted. If that's the case, then I'd like to propose that the details of --conf-file be hidden from the end user. On a headless node the only configuration that Flume can rely on would be a static set of configs under /etc/flume/conf, and that's what should be used by the agent.

        Please let me know if this makes sense.

        Will McQueen added a comment -

        Fix to allow FLUME_CONF_DIR to be set by user. Patch is based on trunk.

        Will McQueen added a comment -

        Fix to allow FLUME_CONF_DIR to be set by user. Patch is based on trunk. This time I clicked "Grant license".

        Roman Shaposhnik added a comment -

        Will, thanks for the patch, but please answer my questions below. I still don't quite understand what this level of flexibility accomplishes for the service script. I can understand why explicit invocations of bin/flume-ng could use this flexibility, but that is already the case, right?

        Will McQueen added a comment -

        Hi Roman,

        I saw your comment after I already posted the patch. For #1, I agree. For #2:

        >>with long running agents on each node having a configuration that changes only when the agent gets restarted
        The configuration can change at any time. There's a poller thread that checks the file for config changes every 30 seconds.

        I'm not sure what the right answer is to your other questions. For launching Flume NG, I was comparing the level of customizability that Apache Flume NG offers to the way that Bigtop packaging uses the init script to call flume-ng. But maybe that's the wrong comparison. So let me back up.

        The init script currently allows FLUME_AGENT_NAME and FLUME_CONF_FILE to be taken from the environment so that the user can customize them. But maybe init scripts should not be customizable? Maybe the init script should enforce a non-overridable FLUME_AGENT_NAME ("agent") and FLUME_CONF_FILE ("/etc/flume-ng/conf/flume.conf")? Maybe init scripts in general should enforce a particular common scenario, like supporting only a single Flume agent running on the host with specific options, and any deviation from that scenario should be handled by directly running the target binary ("/usr/bin/flume-ng" in this case) with custom options for things like --conf, --conf-file, and --name? Or, if the init script is customizable, maybe it could use shell fragments, similar to what /usr/libexec/bigtop-detect-javahome does when it calls /etc/default/bigtop-utils?

        I'd appreciate any light you could shed on standard conventions in this regard. Thank you.

        Cheers,
        Will
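
The "shell fragment" idea raised in this comment can be pictured as an init script that sources an optional defaults file before applying its own fallbacks. This is a sketch under assumptions, not the Bigtop implementation: the function name `load_flume_defaults`, the defaults-file location (a temporary file stands in for something like a file under /etc/default), the sample path /srv/flume/conf, and the choice to derive the FLUME_CONF_FILE default from FLUME_CONF_DIR are all hypothetical. The command line mirrors the invocation quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical sketch: load optional user overrides, then apply defaults.
load_flume_defaults() {
    defaults_file=$1
    # Source the fragment only if it exists and is readable.
    if [ -r "$defaults_file" ]; then
        . "$defaults_file"
    fi
    # Anything the fragment (or the environment) did not set gets a default.
    FLUME_CONF_DIR=${FLUME_CONF_DIR:-/etc/flume-ng/conf}
    FLUME_CONF_FILE=${FLUME_CONF_FILE:-$FLUME_CONF_DIR/flume.conf}
    FLUME_AGENT_NAME=${FLUME_AGENT_NAME:-agent}
}

# Demo: a temporary file stands in for the hypothetical defaults file.
tmp_defaults=$(mktemp)
printf 'FLUME_CONF_DIR=/srv/flume/conf\n' > "$tmp_defaults"

unset FLUME_CONF_DIR FLUME_CONF_FILE FLUME_AGENT_NAME
load_flume_defaults "$tmp_defaults"
rm -f "$tmp_defaults"

# Build (but do not run) the command line in the form quoted in this thread.
flume_cmd="/usr/bin/flume-ng --conf-file $FLUME_CONF_FILE --conf $FLUME_CONF_DIR --name $FLUME_AGENT_NAME"
echo "$flume_cmd"
```

Keeping overrides in a sourced file rather than in the init script itself would leave the script unmodified across package upgrades, which is one possible reason to prefer this approach over editing the script directly.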


          People

          • Assignee:
            Bruno Mahé
            Reporter:
            Will McQueen
          • Votes:
            0
            Watchers:
            1
