Hadoop Common / HADOOP-7539

merge hadoop archive goodness from trunk to .20

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 0.20.205.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      hadoop archive in branch-0.20-security is outdated. When run recently, it hit several bugs, all of which have already been fixed in trunk. This JIRA aims to bring those fixes into branch-0.20-security.

        Activity

        Matt Foley added a comment -

        Closed upon release of 0.20.205.0

        Mahadev konar added a comment -

        I just committed this. Thanks a lot John!

        Mahadev konar added a comment -

        Looks good to me. I'll run some ant tests and check it in to the 0.20 security branch.

        John George added a comment -

        1. Create HAR file using version 1

        $ hadoop fs -cat /tmp/thisis1.har/_masterindex
        1
        0 2127535165 0 1856

        2. Install version 3 of HAR

        $ hadoop fs -cat /tmp/thisis3.har/_masterindex
        3
        0 2127535165 0 2610

        3. Run ls and wordcount on VERSION 1

        $ hadoop fs -ls har:///tmp/thisis1.har
        $ hadoop jar hadoop-examples.jar wordcount har:///tmp/thisis1.har/x.sh /tmp/out.2

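        The _masterindex output above shows that the archive's version stamp is the first token of its first line, which is how a client can tell a version-1 archive from a version-3 one. A minimal sketch of reading that stamp; the sample file below just reproduces the version-1 output shown above, while on a real cluster you would pipe from `hadoop fs -cat` instead:

        ```shell
        # Read the HAR version stamp: the first token of the first _masterindex line.
        # (On a cluster: hadoop fs -cat /tmp/thisis1.har/_masterindex | head -n 1)
        har_version() {
          head -n 1 "$1" | awk '{print $1}'
        }

        # Sample _masterindex contents copied from the version-1 test output above.
        printf '1\n0 2127535165 0 1856\n' > /tmp/_masterindex.sample
        har_version /tmp/_masterindex.sample   # prints 1
        ```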
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12490263/HADOOP-7539-1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/67//console

        This message is automatically generated.

        Mahadev konar added a comment -

        Looks like I might be wrong. The patch seems to be able to read the old har archives as well. John, mind testing it out?

        Mahadev konar added a comment -

        Maybe we want to add a utility to upconvert from version 1 to version 3?

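        No such utility exists in this patch; one hypothetical sketch, assuming the new code can still read version-1 archives (as the testing elsewhere on this JIRA suggests), would be to copy the files out through the har:// filesystem and re-archive them with the new code. The function name, staging directory, and paths are all illustrative:

        ```shell
        # Hypothetical upconvert sketch (not a real tool): copy the contents of an
        # old (version-1) archive out through the har:// scheme, then re-create the
        # archive with the new (version-3) code. Paths are illustrative.
        upconvert_har() {
          old_har=$1        # e.g. /tmp/thisis1.har
          new_name=$2       # e.g. thisis3.har
          hadoop fs -mkdir /tmp/har-upconvert
          hadoop fs -cp "har://$old_har/*" /tmp/har-upconvert
          hadoop archive -archiveName "$new_name" -p /tmp har-upconvert /tmp
        }
        # Usage: upconvert_har /tmp/thisis1.har thisis3.har
        ```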
        Mahadev konar added a comment -

        The only issue I see is that hadoop archives that already exist on the cluster will become obsolete, since the new archive code won't be able to read them?

        John George added a comment -

        Manual tests run:

        • Created a har file as follows:
          • hadoop fs -put test /tmp
          • hadoop archive -archiveName test.har -p /tmp test /tmp
        • Ran the following manual tests:
          • wordcount on a couple of har files
          • streaming on the same har file with:
            hadoop jar hadoop-streaming.jar -Dmapred.reduce.tasks=1 -input har:///tmp/test.har/test/aa -output /tmp/aaa.2 -mapper cat -reducer "wc -l"

        Both of the above jobs completed successfully and had outputs in the corresponding output directory.

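        The streaming job above uses cat as the mapper and "wc -l" as the reducer, so it effectively counts the lines of the archived input file. A local simulation of the same pipeline, on made-up sample data rather than the actual archived file, shows what kind of output to expect:

        ```shell
        # The streaming job (mapper=cat, reducer="wc -l") amounts to a line count.
        # Local simulation on illustrative sample data, not the archived file.
        printf 'line a\nline b\nline c\n' > /tmp/streaming-sample
        cat /tmp/streaming-sample | wc -l   # prints 3
        ```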
        John George added a comment -

        Yes, I will run the manual testing and post the results here.

        I ran "ant test" and it failed the same test that failed without the patch. The results of test-patch are as follows:

        [exec] BUILD SUCCESSFUL
        [exec] Total time: 6 minutes 23 seconds
        [exec]
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 6 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
        [exec]
        [exec] ======================================================================
        [exec] Finished build.
        [exec] ======================================================================

        Mahadev konar added a comment -

        John,
        Since this is a big patch, can you please do some manual testing on a real cluster (a single-node cluster is fine)? Just run an archive job and then a map-reduce job that uses the archives as input, and verify the results. That should suffice.

        John George added a comment -

        Sorry Owen, I meant to say branch-0.20-security (not branch-0.20). Fixed "Description". The patch is also meant for branch-0.20-security.

        Owen O'Malley added a comment -

        No one has proposed making any more releases out of branch-0.20. Can you generate a patch for the branch-0.20-security line?

        John George added a comment -

        The following JIRAs were the most interesting ones, but it made sense to bring in most of the others as well: a number of them are dependencies of the JIRAs that were needed, and merging everything together is also easier.

        MAPREDUCE-1425: archive throws OutOfMemoryError
        MAPREDUCE-2317: HadoopArchives throwing NullPointerException while creating hadoop archives
        MAPREDUCE-1399: The archive command shows a null error message
        MAPREDUCE-1752: Implement getFileBlockLocations in HarFilesystem


          People

          • Assignee: John George
          • Reporter: John George
          • Votes: 0
          • Watchers: 3
