Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1782

FSNamesystem.startFileInternal(..) throws NullPointerException

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0, 0.23.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      I'm observing when there is one balancer running trying to run another one results in
      "Java.lang.NullPointerException" error. I was hoping to see message "Another balancer is running.
      Exiting.... Exiting ...". This is a reproducible issue.

      Details
      ========

      1) Cluster ->elrond

      [hdfs@]$ hadoop version

      2) Run first balancer
      [hdfs]$ hdfs balancer
      1
      through XX.XX.XX.XX:1004 is succeeded.
      [hdfs@]$ hdfs balancer
      11/03/09 16:34:32 INFO balancer.Balancer: namenodes =
      java.io.IOException: java.lang.NullPointerException
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
      at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:396)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
      . Exiting ...
      Balancing took 1.366 seconds

      1. HDFS-1782.patch
        0.8 kB
        John George

        Activity

        Arun C Murthy made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-22-branch #35 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/35/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-22-branch #35 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/35/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #643 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #643 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #582 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/582/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #582 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/582/ )
        Tsz Wo Nicholas Sze made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, John!

        Show
        Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, John!
        Tsz Wo Nicholas Sze made changes -
        Hadoop Flags [Reviewed]
        Fix Version/s 0.23.0 [ 12315571 ]
        Component/s name-node [ 12312926 ]
        Hide
        Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good.

        Show
        Tsz Wo Nicholas Sze added a comment - +1 patch looks good.
        John George made changes -
        Description I'm observing when there is one balancer running trying to run another one results in
        "Java.lang.NullPointerException" error. I was hoping to see message "Another balancer is running.
        Exiting.... Exiting ...". This is a reproducible issue.

        Details
        ========

        1) Cluster ->elrond

        [hdfs@gsbl90568 smilli]$ hadoop version
        Hadoop 0.22.0.1102280202
        Subversion git://hadoopre5.corp.sk1.yahoo.com/home/y/var/builds/thread2/workspace/Cloud-HadoopCOMMON-0.22-Secondary -r
        c7c9a21d7289e29f0133452acf8b761e455a84b5
        Compiled by hadoopqa on Mon Feb 28 02:12:38 PST 2011
        From source with checksum 9ecbc6f17e8847a1cddca2282dbd9b31
        [hdfs@gsbl90568 smilli]$


        2) Run first balancer
        [hdfs@gsbl90565 smilli]$ hdfs balancer
        11/03/09 16:33:56 INFO balancer.Balancer: namenodes = [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
        gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
        11/03/09 16:33:56 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
        Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
        11/03/09 16:33:57 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
        11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600 min(s),
        tokenLifetime=600 min(s)
        11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
        11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys every 150 minute(s)
        11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
        11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600 min(s),
        tokenLifetime=600 min(s)
        11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
        11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys every 150 minute(s)
        11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
        11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.62:1004
        11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.58:1004
        11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.60:1004
        11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: /98.137.97.0/98.137.97.59:1004
        11/03/09 16:33:57 INFO balancer.Balancer: 1 over-utilized: [Source[98.137.97.62:1004, utilization=24.152507825759344]]
        11/03/09 16:33:57 INFO balancer.Balancer: 0 underutilized: []
        11/03/09 16:33:57 INFO balancer.Balancer: Need to move 207.98 GB to make the cluster balanced.
        11/03/09 16:33:57 INFO balancer.Balancer: Decided to move 10 GB bytes from 98.137.97.62:1004 to 98.137.97.58:1004
        11/03/09 16:33:57 INFO balancer.Balancer: Will move 10 GB in this iteration
        Mar 9, 2011 4:33:57 PM 0 0 KB 207.98 GB 10 GB



        .
        .
        .
        11/03/09 16:34:36 INFO balancer.Balancer: Moving block -63570336576981940 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:34:39 INFO balancer.Balancer: Moving block 2379736326585824737 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:21 INFO balancer.Balancer: Moving block 8884583953927078028 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:24 INFO balancer.Balancer: Moving block -135758138424743964 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:27 INFO balancer.Balancer: Moving block -4598153351946352185 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:33 INFO balancer.Balancer: Moving block 2966087210491094643 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:42 INFO balancer.Balancer: Moving block -5573983508500804184 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.
        11/03/09 16:35:58 INFO balancer.Balancer: Moving block -6222779741597113957 from 98.137.97.62:1004 to 98.137.97.59:1004
        through 98.137.97.62:1004 is succeeded.







        3) Run another balancer observe
        [hdfs@gsbl90568 smilli]$ hdfs balancer
        11/03/09 16:34:32 INFO balancer.Balancer: namenodes = [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
        gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
        11/03/09 16:34:32 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
        Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
        11/03/09 16:34:33 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
        11/03/09 16:34:33 INFO balancer.Balancer: Block token params received from NN: keyUpdateInterval=600 min(s),
        tokenLifetime=600 min(s)
        11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
        11/03/09 16:34:33 INFO balancer.Balancer: Balancer will update its block keys every 150 minute(s)
        11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
        java.io.IOException: java.lang.NullPointerException
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
                at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                at java.lang.reflect.Method.invoke(Method.java:597)
                at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
                at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
                at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
                at java.security.AccessController.doPrivileged(Native Method)
                at javax.security.auth.Subject.doAs(Subject.java:396)
                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
                at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
        . Exiting ...
        Balancing took 1.366 seconds
        [hdfs@gsbl90568 smilli]$


        Pls let me know if you need additional information.

        Thanks,
        Suma


        I'm observing when there is one balancer running trying to run another one results in
        "Java.lang.NullPointerException" error. I was hoping to see message "Another balancer is running.
        Exiting.... Exiting ...". This is a reproducible issue.

        Details
        ========

        1) Cluster ->elrond

        [hdfs@]$ hadoop version

        2) Run first balancer
        [hdfs]$ hdfs balancer
        1
        through XX.XX.XX.XX:1004 is succeeded.
        [hdfs@]$ hdfs balancer
        11/03/09 16:34:32 INFO balancer.Balancer: namenodes =
        java.io.IOException: java.lang.NullPointerException
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
                at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
                at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
                at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                at java.lang.reflect.Method.invoke(Method.java:597)
                at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
                at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
                at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
                at java.security.AccessController.doPrivileged(Native Method)
                at javax.security.auth.Subject.doAs(Subject.java:396)
                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
                at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
        . Exiting ...
        Balancing took 1.366 seconds
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12474821/HDFS-1782.patch
        against trunk revision 1086820.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.cli.TestHDFSCLI
        org.apache.hadoop.hdfs.server.datanode.TestTransferRbw
        org.apache.hadoop.hdfs.TestDFSShell
        org.apache.hadoop.hdfs.TestFileConcurrentReader

        -1 contrib tests. The patch failed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//testReport/
        Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12474821/HDFS-1782.patch against trunk revision 1086820. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.datanode.TestTransferRbw org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestFileConcurrentReader -1 contrib tests. The patch failed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/305//console This message is automatically generated.
        John George made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        John George added a comment -

        This was caused because of a null pointer deference. The change was tested by the same person who filed this bug and it passed.

        I thought about adding a test to run multiple balancers to as part of the unit test, but the timing of balancers does not guarantee that this will always test the change that was put in. Adding a "mock" test was a possibility, but it seems to me that it requires source code change to be able to "mock" and test the particular change.

        [exec] BUILD SUCCESSFUL
        [exec] Total time: 49 seconds
        [exec]
        [exec]
        [exec]
        [exec]
        [exec] -1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
        [exec] Please justify why no new tests are needed for this patch.
        [exec] Also please list what manual steps were performed to verify this patch.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        [exec]
        [exec] +1 system test framework. The patch passed system test framework compile.
        [exec]

        Show
        John George added a comment - This was caused because of a null pointer deference. The change was tested by the same person who filed this bug and it passed. I thought about adding a test to run multiple balancers to as part of the unit test, but the timing of balancers does not guarantee that this will always test the change that was put in. Adding a "mock" test was a possibility, but it seems to me that it requires source code change to be able to "mock" and test the particular change. [exec] BUILD SUCCESSFUL [exec] Total time: 49 seconds [exec] [exec] [exec] [exec] [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system test framework. The patch passed system test framework compile. [exec]
        John George made changes -
        Field Original Value New Value
        Attachment HDFS-1782.patch [ 12474821 ]
        John George created issue -

          People

          • Assignee:
            John George
            Reporter:
            John George
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development