Hadoop Map/Reduce / MAPREDUCE-1978

[Rumen] TraceBuilder should provide recursive input folder scanning

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: tools/rumen
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds a -recursive option to TraceBuilder for scanning the input directories recursively.

      Description

      Currently, TraceBuilder assumes that the input is either jobhistory files or a folder containing jobhistory files directly underneath it. There could be use cases where the input folder contains sub-folders that hold the jobhistory files. Rumen should support such input folders.

      1. 1978.patch
        16 kB
        Ravi Gummadi
      2. 1978.v1.patch
        20 kB
        Ravi Gummadi
      3. 1978.v1.4.patch
        18 kB
        Ravi Gummadi
      4. 1978.v1.5.patch
        18 kB
        Ravi Gummadi

          Activity

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #656 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/656/)

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #669 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/669/)

          Ravi Gummadi added a comment -

          The unit test failures are not related to this patch.

          I just committed the patch to trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12477724/1978.v1.5.patch
          against trunk revision 1097679.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestMRCLI
          org.apache.hadoop.tools.TestHadoopArchives
          org.apache.hadoop.tools.TestHarFileSystem

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/198//console

          This message is automatically generated.

          Amar Kamat added a comment -

          The latest patch looks good to me. +1.

          Ravi Gummadi added a comment -

          Attaching a new patch with some minor indentation changes.

          Ravi Gummadi added a comment -

          The failed unit test is not related to this patch at all.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475992/1978.v1.4.patch
          against trunk revision 1090390.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/163//console

          This message is automatically generated.

          Ravi Gummadi added a comment -

          Attaching a new patch so that it applies cleanly to the latest trunk.

          No new code changes compared to the earlier patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12456579/1978.v1.patch
          against trunk revision 1074251.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.tools.rumen.TestRumenJobTraces

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/73//console

          This message is automatically generated.

          Ravi Gummadi added a comment -

          find and awk actually do the work correctly, and we want to compare their behavior against the main code changes of this patch. The test validates a lot of things, such as (1) the globbed path given to fs.globStatus(), (2) fs.listFiles() with and without the -recursive option, and (3) sorting by file name alone instead of sorting by the whole path.

          Ran "ant test" now. Tests passed.
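
Point (3) in the comment above, sorting by file name alone rather than by whole path, can be illustrated with a minimal sketch. This is not code from the patch: the class name, method name, and sample paths are hypothetical, and it uses java.nio.file instead of Hadoop's FileSystem API so that it runs on its own.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not taken from the 1978 patches.
class FileNameSortSketch {

  /** Returns a copy of the given paths ordered by the last path component only. */
  static List<Path> sortByFileName(List<Path> paths) {
    List<Path> sorted = new ArrayList<>(paths);
    sorted.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
    return sorted;
  }

  /** Returns a copy of the given paths ordered by the whole path string. */
  static List<Path> sortByWholePath(List<Path> paths) {
    List<Path> sorted = new ArrayList<>(paths);
    sorted.sort(Comparator.comparing((Path p) -> p.toString()));
    return sorted;
  }

  public static void main(String[] args) {
    // With a recursive scan, files from different sub-folders are mixed,
    // so the two orderings can disagree.
    List<Path> paths = Arrays.asList(
        Paths.get("logs", "a", "job_002"),
        Paths.get("logs", "b", "job_001"));

    // Whole-path order puts the file under "a" first.
    System.out.println(sortByWholePath(paths));
    // File-name order puts job_001 first, even though it sits under "b".
    System.out.println(sortByFileName(paths));
  }
}
```

The two orderings only diverge once files come from more than one directory, which is why the distinction matters for a recursive scan.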

          Vinod Kumar Vavilapalli added a comment -

          The patch looks good to me too. I'd be happy if the test could be modified to do away with the find/awk kludge. For example, createHistoryLogsHierarchy() knows the list of files being created and so can return it; for the flat list, File.listFiles() should suffice.

          The test results were posted a while ago, please run them once more after the test changes just to be sure.
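
The suggestion above can be sketched roughly as follows. This is only an assumption about what such a helper might look like, not the test code from the patch: createHierarchy, flatList, and the file names are hypothetical stand-ins, and the real createHistoryLogsHierarchy() signature is not shown in this issue.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical test helper; names and layout are illustrative only.
class HierarchyHelperSketch {

  /** Creates a small nested tree under root and returns every file it created. */
  static List<File> createHierarchy(File root) throws IOException {
    String[] relativeNames = {"job_a", "sub1/job_b", "sub1/sub2/job_c"};
    List<File> created = new ArrayList<>();
    for (String name : relativeNames) {
      File file = new File(root, name);
      File parent = file.getParentFile();
      if (!parent.isDirectory() && !parent.mkdirs()) {
        throw new IOException("Could not create directory " + parent);
      }
      if (!file.createNewFile()) {
        throw new IOException("Could not create file " + file);
      }
      created.add(file);
    }
    // The caller can use this list as the expected result of a recursive scan.
    return created;
  }

  /** Expected result of a non-recursive scan: regular files directly under root. */
  static List<File> flatList(File root) {
    File[] children = root.listFiles(File::isFile);
    return children == null ? new ArrayList<>() : new ArrayList<>(Arrays.asList(children));
  }
}
```

Because the helper returns what it created, the test can compare the scanner's output against that list directly, with no external find or awk process involved.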

          Ravi Gummadi added a comment -

          Unit tests passed on my local machine.

          ant test-patch gave:

          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system tests framework. The patch passed system tests framework compile.

          Amar Kamat added a comment -

          +1. Looks good to me.

          Ravi Gummadi added a comment -

          Attaching new patch incorporating offline review comments from Amar and Ranjit.

          Ravi Gummadi added a comment -

          -r is not as descriptive as -recursive. Maybe later we could think of supporting "-R" in addition to -recursive, if needed.

          Vinay Kumar Thota added a comment -

          Patch looks good to me. However, why not use a -r option instead of -recursive? In general, -r stands for recursive mode, right?

          Ravi Gummadi added a comment -

          Attaching patch that adds the option "-recursive" to TraceBuilder.

          With the -recursive option, TraceBuilder generates the trace by recursively scanning all the job history logs under the given path.

          Please review the patch and provide your comments.
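
The difference between the default scan and a recursive one can be shown with a minimal, self-contained sketch. It is not the TraceBuilder implementation: it uses java.nio.file rather than Hadoop's FileSystem API, and the class name, the listInputFiles method, and the sample file names are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of flat versus recursive input scanning.
class InputScanSketch {

  /**
   * Lists regular files under dir. When recursive is false, only files
   * directly underneath dir are returned; when true, sub-folders are
   * descended into as well.
   */
  static List<Path> listInputFiles(Path dir, boolean recursive) throws IOException {
    List<Path> files = new ArrayList<>();
    try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
      for (Path child : children) {
        if (Files.isRegularFile(child)) {
          files.add(child);
        } else if (recursive && Files.isDirectory(child)) {
          files.addAll(listInputFiles(child, true));
        }
      }
    }
    return files;
  }

  public static void main(String[] args) throws IOException {
    // Build a tiny input folder: one log at the top level, one in a sub-folder.
    Path root = Files.createTempDirectory("history");
    Files.createFile(root.resolve("job_1"));
    Path sub = Files.createDirectory(root.resolve("sub"));
    Files.createFile(sub.resolve("job_2"));

    System.out.println(listInputFiles(root, false).size()); // prints 1 (flat scan)
    System.out.println(listInputFiles(root, true).size());  // prints 2 (recursive scan)
  }
}
```

Keeping the flat scan as the default, with recursion as an opt-in flag, matches the behavior described in this issue: existing invocations are unaffected unless -recursive is passed.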

          Ranjit Mathew added a comment -

          The need to fix this issue will become even stronger once MAPREDUCE-323 is fixed.


            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Amar Kamat
            • Votes:
              0
              Watchers:
              3
