Hadoop Map/Reduce
MAPREDUCE-1978

[Rumen] TraceBuilder should provide recursive input folder scanning

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: tools/rumen
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds -recursive option to TraceBuilder for scanning the input directories recursively.

      Description

      Currently, TraceBuilder assumes that the input is either jobhistory files or a folder containing jobhistory files directly underneath the specified folder. There could be use cases where the input folder contains sub-folders that hold the jobhistory files. Rumen should support such input folders.
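
The difference between the two scanning modes can be illustrated with a minimal sketch. This is not the code from the attached patches: it uses java.nio.file on the local filesystem rather than the Hadoop FileSystem API, it assumes the input is a folder, and the names RecursiveScanSketch, listInputs and demoTree are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveScanSketch {

    // Lists regular files under root (assumed to be a folder).
    // recursive == false: only files directly underneath root.
    // recursive == true: files in sub-folders at any depth as well.
    public static List<Path> listInputs(Path root, boolean recursive) {
        try (Stream<Path> entries = recursive ? Files.walk(root) : Files.list(root)) {
            return entries.filter(Files::isRegularFile)
                          .sorted()
                          .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical fixture: one file directly under the root, one in a sub-folder.
    public static Path demoTree() {
        try {
            Path root = Files.createTempDirectory("history");
            Files.createFile(root.resolve("job_1"));
            Path sub = Files.createDirectories(root.resolve("2011").resolve("04"));
            Files.createFile(sub.resolve("job_2"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = demoTree();
        System.out.println("flat scan:      " + listInputs(root, false));
        System.out.println("recursive scan: " + listInputs(root, true));
    }
}
```

With the fixture above, the flat scan sees only the file directly under the root, while the recursive scan also picks up the file nested two levels down.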

      1. 1978.v1.5.patch
        18 kB
        Ravi Gummadi
      2. 1978.v1.4.patch
        18 kB
        Ravi Gummadi
      3. 1978.v1.patch
        20 kB
        Ravi Gummadi
      4. 1978.patch
        16 kB
        Ravi Gummadi

          Activity

          Transition                    Time In Source Status   Execution Times   Last Executer   Last Execution Date
          Patch Available -> Open       199d 18h 44m            2                 Ravi Gummadi    29/Apr/11 05:36
          Open -> Patch Available       74d 23h 29m             3                 Ravi Gummadi    29/Apr/11 05:37
          Patch Available -> Resolved   3d 4h 53m               1                 Ravi Gummadi    02/May/11 10:30
          Resolved -> Closed            196d 15h 19m            1                 Arun C Murthy   15/Nov/11 00:49
          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #656 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/656/)

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #669 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/669/)

          Ravi Gummadi made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Release Note Adds -recursive option to TraceBuilder for scanning the input directories recursively.
          Resolution Fixed [ 1 ]
          Ravi Gummadi added a comment -

          The unit test failures are not related to this patch.

          I just committed the patch to trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12477724/1978.v1.5.patch
          against trunk revision 1097679.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestMRCLI
          org.apache.hadoop.tools.TestHadoopArchives
          org.apache.hadoop.tools.TestHarFileSystem

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//console

          This message is automatically generated.

          Amar Kamat added a comment -

          The latest patch looks good to me. +1.

          Ravi Gummadi made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.23.0 [ 12315570 ]
          Ravi Gummadi made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Ravi Gummadi made changes -
          Attachment 1978.v1.5.patch [ 12477724 ]
          Ravi Gummadi added a comment -

          Attaching new patch making some minor indentation changes.

          Ravi Gummadi added a comment -

          Failed unit test is not related to this patch at all.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475992/1978.v1.4.patch
          against trunk revision 1090390.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//console

          This message is automatically generated.

          Ravi Gummadi made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Ravi Gummadi made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Ravi Gummadi made changes -
          Attachment 1978.v1.4.patch [ 12475992 ]
          Ravi Gummadi added a comment -

          Attaching new patch so that it applies cleanly to latest trunk.

          No new code changes compared to earlier patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12456579/1978.v1.patch
          against trunk revision 1074251.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.tools.rumen.TestRumenJobTraces

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//console

          This message is automatically generated.

          Ravi Gummadi added a comment -

          find and awk actually do the work correctly, and we want to compare that behavior against the main code changes of this patch. The test validates a lot of things: (1) the globbed path given to fs.globStatus(), (2) fs.listFiles() with and without the -recursive option, and (3) sorting of filenames alone instead of sorting the whole paths.

          Ran "ant test" now. Tests passed.
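
Point (3) above, ordering by filename alone rather than by the whole path, can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: it uses java.nio.file.Path instead of Hadoop's Path, the class and method names are hypothetical, and every input path is assumed to have a file-name component.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class FileNameSortSketch {

    // Returns a copy ordered by the last path component only, so files
    // from different sub-folders interleave by name.
    public static List<Path> sortByFileName(List<Path> paths) {
        List<Path> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
        return sorted;
    }

    // Returns a copy ordered by the full path string, for contrast:
    // here the parent folder dominates the ordering.
    public static List<Path> sortByWholePath(List<Path> paths) {
        List<Path> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparing((Path p) -> p.toString()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Path> inputs = List.of(
            Paths.get("history/a/job_3"),
            Paths.get("history/b/job_1"),
            Paths.get("history/a/job_2"));
        System.out.println("by file name:  " + sortByFileName(inputs));
        System.out.println("by whole path: " + sortByWholePath(inputs));
    }
}
```

In this example the filename ordering starts with job_1 even though it lives under folder b, whereas the whole-path ordering lists everything under folder a first.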

          Vinod Kumar Vavilapalli added a comment -

          The patch looks good to me too. I'd be happy if the test could be modified to do away with all the find/awk kludge. For example, createHistoryLogsHierarchy() knows the list of files being created and so can return the same; for the flat list, File.listFiles() should suffice.

          The test results were posted a while ago; please run the tests once more after the test changes just to be sure.

          Ravi Gummadi added a comment -

          Unit tests passed on my local machine.

          ant test-patch gave:

          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system tests framework. The patch passed system tests framework compile.

          Ravi Gummadi made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Amar Kamat added a comment -

          +1. Looks good to me.

          Ravi Gummadi made changes -
          Attachment 1978.v1.patch [ 12456579 ]
          Ravi Gummadi added a comment -

          Attaching new patch incorporating offline review comments from Amar and Ranjit.

          Ravi Gummadi added a comment -

          -r is not descriptive enough compared to -recursive. Maybe later we could think of supporting "-R" in addition to -recursive, if needed.

          Vinay Kumar Thota added a comment -

          Patch looks good to me. However, why not use the -r option instead of -recursive? In general, -r represents the recursive mode, right?

          Ravi Gummadi made changes -
          Attachment 1978.patch [ 12455353 ]
          Ravi Gummadi added a comment -

          Attaching patch that adds the option "-recursive" to TraceBuilder.

          With the -recursive option, TraceBuilder generates the trace by scanning all the job history logs recursively under the given path.

          Please review the patch and provide your comments.

          Ravi Gummadi made changes -
          Assignee Amar Kamat [ amar_kamat ] Ravi Gummadi [ ravidotg ]
          Ranjit Mathew added a comment -

          The need to fix this issue will become even stronger once MAPREDUCE-323 is fixed.

          Ranjit Mathew made changes -
          Link This issue relates to MAPREDUCE-323 [ MAPREDUCE-323 ]
          Amar Kamat made changes -
          Description
            Original Value: Currently, Rumen assumes that the input is either jobhistory files or a folders containing jobhistory files directly underneath the specified folder. There could be a use cases where the input folder could contain sub-folders containing jobhistory files. Rumen should support such input folders.
            New Value: Currently, {{TraceBuilder}} assumes that the input is either jobhistory files or a folders containing jobhistory files directly underneath the specified folder. There could be a use cases where the input folder could contain sub-folders containing jobhistory files. Rumen should support such input folders.
          Amar Kamat made changes -
          Field Original Value New Value
          Assignee Amar Kamat [ amar_kamat ]
          Amar Kamat created issue -

            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Amar Kamat
            • Votes:
              0
              Watchers:
              3
