Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      What is new in HADOOP-5087:

      - Fixed the regex to correctly parse the chukwa agent ADD command.
      - Added a test case to validate the chukwa agent ADD command.

      Show
      What is new in HADOOP-5087 : - Fixed the regex to correctly parse the chukwa agent ADD command. - Added a test case to validate the chukwa agent ADD command.
    1. fixedregex.patch
      9 kB
      Ari Rabkin
    2. fixftaregex.patch
      1 kB
      Ari Rabkin
    3. HADOOP-5087.patch
      6 kB
      Jerome Boulon
    4. HADOOP-5087-2.patch
      9 kB
      Jerome Boulon
    5. reluctantregex.patch
      9 kB
      Ari Rabkin

      Issue Links

        Activity

        Hide
        jboulon Jerome Boulon added a comment -
        • Fix the regex to correctly parse the ADD command
        • Add a test case to validate the ADD command
        Show
        jboulon Jerome Boulon added a comment - Fix the regex to correctly parse the ADD command Add a test case to validate the ADD command
        Hide
        jboulon Jerome Boulon added a comment -

        Ps: The ChukwaTestAdaptor could be useful for testing the AdaptorManager implementation
        https://issues.apache.org/jira/browse/HADOOP-4893?focusedCommentId=12663916#action_12663916

        Show
        jboulon Jerome Boulon added a comment - Ps: The ChukwaTestAdaptor could be useful for testing the AdaptorManager implementation https://issues.apache.org/jira/browse/HADOOP-4893?focusedCommentId=12663916#action_12663916
        Hide
        asrabkin Ari Rabkin added a comment -

        Does this regex work if an adaptor has no parameters? Do we have a test case to cover this?

        Show
        asrabkin Ari Rabkin added a comment - Does this regex work if an adaptor has no parameters? Do we have a test case to cover this?
        Hide
        jboulon Jerome Boulon added a comment -

        Yes, HADOOP-5087-2.patch now contains 2 additional test case:

        • "ADD org.apache.hadoop.chukwa.datacollection.adaptor.ChukwaTestAdaptor chukwaTestAdaptorType 0 114027"
        • "ADD org.apache.hadoop.chukwa.datacollection.adaptor.ChukwaTestAdaptor chukwaTestAdaptorType 114027"
        Show
        jboulon Jerome Boulon added a comment - Yes, HADOOP-5087 -2.patch now contains 2 additional test case: "ADD org.apache.hadoop.chukwa.datacollection.adaptor.ChukwaTestAdaptor chukwaTestAdaptorType 0 114027" "ADD org.apache.hadoop.chukwa.datacollection.adaptor.ChukwaTestAdaptor chukwaTestAdaptorType 114027"
        Hide
        asrabkin Ari Rabkin added a comment -

        Awesome. +1

        Show
        asrabkin Ari Rabkin added a comment - Awesome. +1
        Hide
        jboulon Jerome Boulon added a comment -

        Junit failed because of HADOOP-5138

        Show
        jboulon Jerome Boulon added a comment - Junit failed because of HADOOP-5138
        Hide
        asrabkin Ari Rabkin added a comment -

        I think this is the right one – I fiddled for a while and checked a bunch of the test cases.

        Show
        asrabkin Ari Rabkin added a comment - I think this is the right one – I fiddled for a while and checked a bunch of the test cases.
        Hide
        asrabkin Ari Rabkin added a comment -

        This patch removes trailing spaces from adaptor parameters; I'm pretty sure this is the Right Thing.

        Show
        asrabkin Ari Rabkin added a comment - This patch removes trailing spaces from adaptor parameters; I'm pretty sure this is the Right Thing.
        Hide
        jboulon Jerome Boulon added a comment -

        The definition for the ADD command is:

        // words should contain (space delimited):
        // 0) command ("add")
        // 1) AdaptorClassname
        // 2) dataType (e.g. "hadoop_log")
        // 3) params <optional>
        // (e.g. for files, this is filename,
        // but can be arbitrarily many space
        // delimited agent specific params )
        // 4) offset

        How can you remove trailing spaces from adaptor parameters, this is adator specific and the adaptor should take care of that and this should not be automatically by the processCommand: HADOOP-5087-2.patch is doing that

        Current tests cases are failing for 2 reasons:
        -> space on the filename and the adaptor should be fixed
        -> A test case send some chunks to the queue but do not clean up after itself and the shutdown method on the agent is not doing any sort of cleanup since in the real world the agent is calling System.exit(0). The solution is to move that test in a separate test case. Since we are forking, it will be fine.

        Show
        jboulon Jerome Boulon added a comment - The definition for the ADD command is: // words should contain (space delimited): // 0) command ("add") // 1) AdaptorClassname // 2) dataType (e.g. "hadoop_log") // 3) params <optional> // (e.g. for files, this is filename, // but can be arbitrarily many space // delimited agent specific params ) // 4) offset How can you remove trailing spaces from adaptor parameters, this is adator specific and the adaptor should take care of that and this should not be automatically by the processCommand: HADOOP-5087 -2.patch is doing that Current tests cases are failing for 2 reasons: -> space on the filename and the adaptor should be fixed -> A test case send some chunks to the queue but do not clean up after itself and the shutdown method on the agent is not doing any sort of cleanup since in the real world the agent is calling System.exit(0). The solution is to move that test in a separate test case. Since we are forking, it will be fine.
        Hide
        asrabkin Ari Rabkin added a comment -

        Currently, no adaptors assume that their parameters can end with spaces. So we can change that part of the spec without breaking things. And I think it's generally more confusing than useful; if an adaptor needs parameters to end with spaces, they can quote their parameters.

        Show
        asrabkin Ari Rabkin added a comment - Currently, no adaptors assume that their parameters can end with spaces. So we can change that part of the spec without breaking things. And I think it's generally more confusing than useful; if an adaptor needs parameters to end with spaces, they can quote their parameters.
        Hide
        jboulon Jerome Boulon added a comment -

        The idea of HADOOP-4947 was to have a more flexible parsing for chukwa commands.
        Moving to regex was a good idea but the current regex to match the previous parsing (6-7 simple statements) seems to be very complicated and will be difficult to extend in the future.

        So, I'm asking if in order to keep it simple, shouldn't we revert back to something similar to the initial parsing?

        Show
        jboulon Jerome Boulon added a comment - The idea of HADOOP-4947 was to have a more flexible parsing for chukwa commands. Moving to regex was a good idea but the current regex to match the previous parsing (6-7 simple statements) seems to be very complicated and will be difficult to extend in the future. So, I'm asking if in order to keep it simple, shouldn't we revert back to something similar to the initial parsing?
        Hide
        asrabkin Ari Rabkin added a comment -

        I wouldn't revert. The previous code was very complex and difficult to extend, too. And had a number of quirks, or bugs, depending on what you think the proper behavior was. I think this is actually simpler. I vote keep.

        Show
        asrabkin Ari Rabkin added a comment - I wouldn't revert. The previous code was very complex and difficult to extend, too. And had a number of quirks, or bugs, depending on what you think the proper behavior was. I think this is actually simpler. I vote keep.
        Hide
        macyang Mac Yang added a comment -

        Regex is very powerful and could provide an elegant solution to the right problem. However, it's not the easiest thing to read and maintain.

        A typical answer to regex maintainability issue is to have detailed comment on the regex. O'Reilly has an article on how to maintain regex which I thought was quite useful (http://www.perl.com/pub/a/2004/01/16/regexps.html). I think we should do something like that if we want to take the regex approach.

        Show
        macyang Mac Yang added a comment - Regex is very powerful and could provide an elegant solution to the right problem. However, it's not the easiest thing to read and maintain. A typical answer to regex maintainability issue is to have detailed comment on the regex. O'Reilly has an article on how to maintain regex which I thought was quite useful ( http://www.perl.com/pub/a/2004/01/16/regexps.html ). I think we should do something like that if we want to take the regex approach.
        Hide
        asrabkin Ari Rabkin added a comment -

        Comments are good. It should be easy to split the regex into pieces with comments and I'm happy to do it. But we should decide exactly what the behavior we want is, in the case where you have multiple spaces between an Adaptor's parameters and the starting offset. Which spaces belong to the parameter, and which are discarded?
        That is, suppose have something that looks like:
        add ...FileTailingAdaptor... foo 10
        Is the filename "foo" or "foo " or?

        This is basically a matter of taste. I vote for the former; I think Jerome preferrs the latter. Other opinions?

        Show
        asrabkin Ari Rabkin added a comment - Comments are good. It should be easy to split the regex into pieces with comments and I'm happy to do it. But we should decide exactly what the behavior we want is, in the case where you have multiple spaces between an Adaptor's parameters and the starting offset. Which spaces belong to the parameter, and which are discarded? That is, suppose have something that looks like: add ...FileTailingAdaptor... foo 10 Is the filename "foo" or "foo " or? This is basically a matter of taste. I vote for the former; I think Jerome preferrs the latter. Other opinions?
        Hide
        eyang Eric Yang added a comment -

        +1 on "foo".

        Show
        eyang Eric Yang added a comment - +1 on "foo".
        Hide
        macyang Mac Yang added a comment -

        another +1 for "foo"

        Show
        macyang Mac Yang added a comment - another +1 for "foo"
        Hide
        asrabkin Ari Rabkin added a comment -

        You asked for it, you got it. Now with comments, and reluctant matching.

        Show
        asrabkin Ari Rabkin added a comment - You asked for it, you got it. Now with comments, and reluctant matching.
        Hide
        asrabkin Ari Rabkin added a comment -

        This has been languishing for a while and I'd like to resolve it; I don't want to commit anything without a little more consensus, though.

        Thoughts on the most recent patch? [Is there a way to trigger hudson to re-check a patch?]

        Show
        asrabkin Ari Rabkin added a comment - This has been languishing for a while and I'd like to resolve it; I don't want to commit anything without a little more consensus, though. Thoughts on the most recent patch? [Is there a way to trigger hudson to re-check a patch?]
        Hide
        asrabkin Ari Rabkin added a comment -

        FTA doesn't parse its argument correctly.

        Show
        asrabkin Ari Rabkin added a comment - FTA doesn't parse its argument correctly.
        Hide
        asrabkin Ari Rabkin added a comment -

        patch should only affect file tailing adaptor

        Show
        asrabkin Ari Rabkin added a comment - patch should only affect file tailing adaptor
        Hide
        asrabkin Ari Rabkin added a comment -

        existing tests cover this case, none added

        Show
        asrabkin Ari Rabkin added a comment - existing tests cover this case, none added
        Hide
        eyang Eric Yang added a comment -

        +1 on fixftaregex.patch

        Show
        eyang Eric Yang added a comment - +1 on fixftaregex.patch
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12401259/fixftaregex.patch
        against trunk revision 749318.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12401259/fixftaregex.patch against trunk revision 749318. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/39/console This message is automatically generated.
        Hide
        hudson Hudson added a comment -
        Show
        hudson Hudson added a comment - Integrated in Hadoop-trunk #778 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/ )

          People

          • Assignee:
            asrabkin Ari Rabkin
            Reporter:
            jboulon Jerome Boulon
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development