Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-2317

HadoopArchives throwing NullPointerException while creating hadoop archives (.har files)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.3
    • Fix Version/s: 0.21.1, 0.22.0, 0.23.0
    • Component/s: harchive
    • Labels:
      None
    • Environment:

      windows

    • Hadoop Flags:
      Reviewed

      Description

      While we are trying to run hadoop archive tool in widows using this way, it is giving the below exception.

      java org.apache.hadoop.tools.HadoopArchives -archiveName temp.har D:/test/in E:/temp

       
      
      java.lang.NullPointerException
      	at org.apache.hadoop.tools.HadoopArchives.writeTopLevelDirs(HadoopArchives.java:320)
      	at org.apache.hadoop.tools.HadoopArchives.archive(HadoopArchives.java:386)
      	at org.apache.hadoop.tools.HadoopArchives.run(HadoopArchives.java:725)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      	at org.apache.hadoop.tools.HadoopArchives.main(HadoopArchives.java:739)
      
      

      I see the code flow to handle this feature in windows also,

      Path.java
      
      /** Returns the parent of a path or null if at root. */
        public Path getParent() {
          String path = uri.getPath();
          int lastSlash = path.lastIndexOf('/');
          int start = hasWindowsDrive(path, true) ? 3 : 0;
          if ((path.length() == start) ||               // empty path
              (lastSlash == start && path.length() == start+1)) { // at root
            return null;
          }
          String parent;
          if (lastSlash==-1) {
            parent = CUR_DIR;
          } else {
            int end = hasWindowsDrive(path, true) ? 3 : 0;
            parent = path.substring(0, lastSlash==end?end+1:lastSlash);
          }
          return new Path(uri.getScheme(), uri.getAuthority(), parent);
        }
      
      
      1. MAPREDUCE-2317-trunk.patch
        1 kB
        Devaraj K
      2. MAPREDUCE-2317-0.20.patch
        1 kB
        Devaraj K
      3. MAPREDUCE-2317.patch
        1 kB
        Devaraj K

        Activity

        Hide
        Tsz Wo Nicholas Sze added a comment -
        • When parent == null, should it be not added to parents list?
        • It seems that you are not using the latest trunk.
        Show
        Tsz Wo Nicholas Sze added a comment - When parent == null , should it be not added to parents list? It seems that you are not using the latest trunk.
        Hide
        Devaraj K added a comment -

        This problem is not there in the latest trunk. It is coming in 0.20.3 branch. Patch is provided for that.

        Show
        Devaraj K added a comment - This problem is not there in the latest trunk. It is coming in 0.20.3 branch. Patch is provided for that.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Hi Devaraj, if the problem is not in the latest trunk, is it already fixed by some other JIRAs? Do you know which JIRAs have fixed it?

        Show
        Tsz Wo Nicholas Sze added a comment - Hi Devaraj, if the problem is not in the latest trunk, is it already fixed by some other JIRAs? Do you know which JIRAs have fixed it?
        Hide
        Devaraj K added a comment -

        Hi Nicholas, I couldnot find the jira id, but the problem is rectified in the trunk.

        Show
        Devaraj K added a comment - Hi Nicholas, I couldnot find the jira id, but the problem is rectified in the trunk.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        For 0.20, is it the same that when parent == null, it should not be added to parents list?

        Show
        Tsz Wo Nicholas Sze added a comment - For 0.20, is it the same that when parent == null , it should not be added to parents list?
        Hide
        Mahadev konar added a comment -

        devaraj,
        the jira says that the bug affects 0.23 and the fix is for 0.23. But you mentioned above that the trunk doesnt have this issue? Do you intend to close the jira as resolved?

        Show
        Mahadev konar added a comment - devaraj, the jira says that the bug affects 0.23 and the fix is for 0.23. But you mentioned above that the trunk doesnt have this issue? Do you intend to close the jira as resolved?
        Hide
        Devaraj K added a comment -

        I have mistakenly selected the patch version as 0.23, but it is intended for 0.20.3. It would be better if we commit these changes in 0.20.3.

        Show
        Devaraj K added a comment - I have mistakenly selected the patch version as 0.23, but it is intended for 0.20.3. It would be better if we commit these changes in 0.20.3.
        Hide
        Devaraj K added a comment -

        Thanks for reviewing.

        @Nicholas:

        is it the same that when parent == null, it should not be added to parents list?

        Yes.

        If it is unix system, for root dir it will go here and do nothing.

         
            Path root = new Path(Path.SEPARATOR);
            for (int i = 0; i < deepest.depth(); i++) {
              List<Path> parents = new ArrayList<Path>();
              for (Path p: justDirs) {
                if (p.compareTo(root) == 0){
                  //don nothing
                }
        

        For windows system, when p is windows root dir (i.e. like C: ), it will go in else and parent will come as null, and causes NullPointerException.

                else {
                  Path parent = p.getParent();
                  if (allpaths.containsKey(parent.toString())) {
                    HashSet<String> children = allpaths.get(parent.toString());
                    children.add(p.getName());
                  }
        
        Show
        Devaraj K added a comment - Thanks for reviewing. @Nicholas: is it the same that when parent == null, it should not be added to parents list? Yes. If it is unix system, for root dir it will go here and do nothing. Path root = new Path(Path.SEPARATOR); for (int i = 0; i < deepest.depth(); i++) { List <Path> parents = new ArrayList <Path> (); for (Path p: justDirs) { if (p.compareTo(root) == 0){ //don nothing } For windows system, when p is windows root dir (i.e. like C: ), it will go in else and parent will come as null, and causes NullPointerException. else { Path parent = p.getParent(); if (allpaths.containsKey(parent.toString())) { HashSet <String> children = allpaths.get(parent.toString()); children.add(p.getName()); }
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Let me clarify my question: after the patch, the codes look like

        Path parent = p.getParent();
        if(null != parent) {
          ...
        }
        parents.add(parent);
        

        If parent == null, we add it to parents. Should it not be added? Otherwise, I think we will get NPE later.

        Show
        Tsz Wo Nicholas Sze added a comment - Let me clarify my question: after the patch, the codes look like Path parent = p.getParent(); if ( null != parent) { ... } parents.add(parent); If parent == null , we add it to parents . Should it not be added? Otherwise, I think we will get NPE later.
        Hide
        Devaraj K added a comment -

        Sorry for late reply.

        parents is assigning to justDirs and justDirs used in the for loop only. When null gets added to parents loop ends because i reaches to deepest.depth().

        If we move parents.add(parent); inside null check or if we keep outside also doesnot give any functionality difference.

        Show
        Devaraj K added a comment - Sorry for late reply. parents is assigning to justDirs and justDirs used in the for loop only. When null gets added to parents loop ends because i reaches to deepest.depth(). If we move parents.add(parent); inside null check or if we keep outside also doesnot give any functionality difference.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Hi Devaraj, thanks for the explanation of the codes. I really like the following comment in the codes.

        // this is tricky
        

        Still, how about moving parents.add(parent) inside null check for easing future maintenance?

        Since the 0.23 have this part of codes. let's also commit it.

        Show
        Tsz Wo Nicholas Sze added a comment - Hi Devaraj, thanks for the explanation of the codes. I really like the following comment in the codes. // this is tricky Still, how about moving parents.add(parent) inside null check for easing future maintenance? Since the 0.23 have this part of codes. let's also commit it.
        Hide
        Devaraj K added a comment -

        Hi Nicholas, I have updated the patch and also provided patch as per above comments.

        Show
        Devaraj K added a comment - Hi Nicholas, I have updated the patch and also provided patch as per above comments.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        +1 patch look good. Thanks a lot!

        Show
        Tsz Wo Nicholas Sze added a comment - +1 patch look good. Thanks a lot!
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12477114/MAPREDUCE-2317-trunk.patch
        against trunk revision 1094093.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12477114/MAPREDUCE-2317-trunk.patch against trunk revision 1094093. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/179//console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        The failed tests are not related; see MAPREDUCE-2448.

        This is a simple change. Devaraj has tested it manually. No new tests needed.

        I have committed this. Thanks, Devaraj!

        Show
        Tsz Wo Nicholas Sze added a comment - The failed tests are not related; see MAPREDUCE-2448 . This is a simple change. Devaraj has tested it manually. No new tests needed. I have committed this. Thanks, Devaraj!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #645 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/645/)

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #645 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/645/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #42 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/42/)

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-22-branch #42 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/42/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #669 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/669/)

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #669 (See https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/669/ )

          People

          • Assignee:
            Devaraj K
            Reporter:
            Devaraj K
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development