Bigtop
  1. Bigtop
  2. BIGTOP-829

Hue status returned wrong information on SLES

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: rpm
    • Labels:
      None

      Description

      BIGTOP-782 fixed the stop functionality for hue on SLES. We still need to fix the status functionality for SLES.

      Presently, the wrong status is reported.

      1. BIGTOP-829.1.patch
        1 kB
        Mark Grover
      2. BIGTOP-829.2.patch
        0.8 kB
        Mark Grover
      3. BIGTOP-829.3.patch
        1 kB
        Mark Grover

        Activity

        Hide
        Roman Shaposhnik added a comment -

        Mark it looks like you take code from the first patch that was attached to the BIGTOP-782, but I believe the actual commit ended up being based on https://issues.apache.org/jira/secure/attachment/12556375/0002-BIGTOP-782.-service-hue-status-still-show-failed-aft.patch

        Any chance you can do the same kind of thing?

        Show
        Roman Shaposhnik added a comment - Mark it looks like you take code from the first patch that was attached to the BIGTOP-782 , but I believe the actual commit ended up being based on https://issues.apache.org/jira/secure/attachment/12556375/0002-BIGTOP-782.-service-hue-status-still-show-failed-aft.patch Any chance you can do the same kind of thing?
        Hide
        Mark Grover added a comment -

        Roman, thanks for the feedback.

        The reason I used the pidofproc (from patch1 of BIGTOP-782) instead of kill -0 approach (from patch 2) was because pidofproc's exit code is one of the LSB defined exit status codes for "status".
        Consequently, the exit code for pidofproc can be:

        0	program is running or service is OK
        1	program is dead and /var/run pid file exists
        2	program is dead and /var/lock lock file exists
        3	program is not running
        4	program or service status is unknown
        5-99	reserved for future LSB use
        100-149	reserved for distribution use
        150-199	reserved for application use
        200-254	reserved
        

        If we use kill -0 for status, I wasn't able to find concrete documentation as to what the exit code would be. The best I could find was "0" if process is running and "1" if it wasn't. So, the non-success error codes wouldn't have been compatible with pidofproc.
        Moreover, other components (I looked at common/flume/flume-agent.init) seem to doing similar things, using kill commands for stopping but pidofproc for status.

        For stopping the process, kill commands are ok because we explicitly return 0 but to me it seemed like we should continue to use pidofproc for status.

        What do you think?

        References:
        http://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html
        http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

        Show
        Mark Grover added a comment - Roman, thanks for the feedback. The reason I used the pidofproc (from patch1 of BIGTOP-782 ) instead of kill -0 approach (from patch 2) was because pidofproc's exit code is one of the LSB defined exit status codes for "status". Consequently, the exit code for pidofproc can be: 0 program is running or service is OK 1 program is dead and / var /run pid file exists 2 program is dead and / var /lock lock file exists 3 program is not running 4 program or service status is unknown 5-99 reserved for future LSB use 100-149 reserved for distribution use 150-199 reserved for application use 200-254 reserved If we use kill -0 for status, I wasn't able to find concrete documentation as to what the exit code would be. The best I could find was "0" if process is running and "1" if it wasn't. So, the non-success error codes wouldn't have been compatible with pidofproc. Moreover, other components (I looked at common/flume/flume-agent.init) seem to doing similar things, using kill commands for stopping but pidofproc for status. For stopping the process, kill commands are ok because we explicitly return 0 but to me it seemed like we should continue to use pidofproc for status. What do you think? References: http://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
        Hide
        Johnny Zhang added a comment -

        it seems reasonable to me that we can get more debug information from the non-success code. As long we can keep it consistent cross the board, I think it is a good idea to migrate to it.

        Show
        Johnny Zhang added a comment - it seems reasonable to me that we can get more debug information from the non-success code. As long we can keep it consistent cross the board, I think it is a good idea to migrate to it.
        Hide
        Roman Shaposhnik added a comment -

        Understood, my only concern is that the mucking up with sed and stuff was considered to be iffy for the original patch, so I'm not sure we should be resurrecting it.

        If we do – perhaps we then should reapply the original patch to the BIGTOP-782 as well.

        Thoughts?

        Show
        Roman Shaposhnik added a comment - Understood, my only concern is that the mucking up with sed and stuff was considered to be iffy for the original patch, so I'm not sure we should be resurrecting it. If we do – perhaps we then should reapply the original patch to the BIGTOP-782 as well. Thoughts?
        Hide
        Mark Grover added a comment -

        Thanks for your input, Johnny.

        Roman, I see what you mean. For checking the status, flume-agent does something like:

        pidofproc -p $FLUME_PID_FILE java > /dev/null
        

        hardcoding java as the process.

        Consequently, we can do something similar but the trouble is python is not python but python2.6 or python2.7, etc. so we can't hardcode the python version. Sed'ing seems imminent to me, unless you have some other ideas. Thoughts?

        Show
        Mark Grover added a comment - Thanks for your input, Johnny. Roman, I see what you mean. For checking the status, flume-agent does something like: pidofproc -p $FLUME_PID_FILE java > /dev/ null hardcoding java as the process. Consequently, we can do something similar but the trouble is python is not python but python2.6 or python2.7, etc. so we can't hardcode the python version. Sed'ing seems imminent to me, unless you have some other ideas. Thoughts?
        Hide
        Mark Grover added a comment -

        Actually I found another way, which will get rid of the ugly sed. If we use the "-L" flag in pidofproc, it will follow the system's python symlink, therefore we can refer the executable as just "python" instead of its version qualified name (e.g. "python2.6"), thereby removing the need for sed.

        Patch attached.

        Show
        Mark Grover added a comment - Actually I found another way, which will get rid of the ugly sed. If we use the "-L" flag in pidofproc, it will follow the system's python symlink, therefore we can refer the executable as just "python" instead of its version qualified name (e.g. "python2.6"), thereby removing the need for sed. Patch attached.
        Hide
        Bruno Mahé added a comment -

        the "-L" flag is not specified in the LSB (see http://refspecs.linuxfoundation.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html ).
        It may be a distro specific addition and not available everywhere.
        So is this flag available as well for sles11, centOS 5.X, 6.X and the ubuntu/debian?

        Show
        Bruno Mahé added a comment - the "-L" flag is not specified in the LSB (see http://refspecs.linuxfoundation.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html ). It may be a distro specific addition and not available everywhere. So is this flag available as well for sles11, centOS 5.X, 6.X and the ubuntu/debian?
        Hide
        Mark Grover added a comment -

        Bruno, that's a valid point. I only checked on SLES11 because the file being modified is SuSE specific hue.init.suse. Is that still a concern?

        Show
        Mark Grover added a comment - Bruno, that's a valid point. I only checked on SLES11 because the file being modified is SuSE specific hue.init.suse . Is that still a concern?
        Hide
        Bruno Mahé added a comment -

        Fair enough. That's not a concern then.
        Although I would add a comment or something so if someday we try to merge the init scripts, we would not miss that.

        +1

        Show
        Bruno Mahé added a comment - Fair enough. That's not a concern then. Although I would add a comment or something so if someday we try to merge the init scripts, we would not miss that. +1
        Hide
        Mark Grover added a comment -

        New patch with Bruno's suggestion. Also fixed up an old reference to PID file location in one of the comments.

        Show
        Mark Grover added a comment - New patch with Bruno's suggestion. Also fixed up an old reference to PID file location in one of the comments.
        Hide
        Mark Grover added a comment -

        Thanks Bruno! Uploaded a new patch based on your feedback.

        Show
        Mark Grover added a comment - Thanks Bruno! Uploaded a new patch based on your feedback.
        Hide
        Roman Shaposhnik added a comment -

        +1 and committed!

        Show
        Roman Shaposhnik added a comment - +1 and committed!

          People

          • Assignee:
            Mark Grover
            Reporter:
            Mark Grover
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development