Hadoop HDFS › HDFS-3399: BookKeeper option support for NN HA › HDFS-3423

BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0, 2.0.2-alpha
    • Component/s: None
    • Labels: None

      Description

      Suppose the inProgress_000X znode is corrupted because its metadata (version, ledgerId, firstTxId) was never written to it. NameNode startup contains logic to recover all unfinalized segments; while trying to read such a segment, it hits the IOException below and shuts down.

      EditLogLedgerMetadata.java:
      
      static EditLogLedgerMetadata read(ZooKeeper zkc, String path)
          throws IOException, KeeperException.NoNodeException {
        byte[] data = zkc.getData(path, false, null);
        String[] parts = new String(data).split(";");
        if (parts.length == 3) {
          // ... read in-progress metadata (3 fields)
        } else if (parts.length == 4) {
          // ... read in-progress metadata (4 fields)
        } else {
          throw new IOException("Invalid ledger entry, "
                                + new String(data));
        }
      }
      

      Scenario: how is a bad inProgress_000X znode left behind?
      Assume BKJM has created the inProgress_000X znode, but ZooKeeper becomes unavailable before the metadata is added. inProgress_000X then ends up with partial information.
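The failure mode above suggests guarding the parse. A minimal sketch of a read path that tolerates an empty or truncated payload instead of failing NameNode startup (hypothetical helper, not the actual BKJM fix; field counts mirror the snippet above):

```java
// Hypothetical sketch: treat an empty/partial znode payload as a
// recoverable condition so the caller can skip or clean up the bad
// inProgress_ node rather than abort startup.
public class LedgerMetadataParseSketch {

    // Returns the split fields, or null when the payload is empty or does
    // not have one of the two expected field counts (3 or 4).
    static String[] parseOrNull(byte[] data) {
        if (data == null || data.length == 0) {
            return null; // znode was created but metadata was never written
        }
        String[] parts = new String(data).split(";");
        return (parts.length == 3 || parts.length == 4) ? parts : null;
    }
}
```

A caller in the recovery loop could then log and skip (or delete) the znode when `parseOrNull` returns null, rather than throwing and shutting down.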

      1. HDFS-3423.diff
        20 kB
        Ivan Kelly
      2. HDFS-3423.diff
        20 kB
        Ivan Kelly
      3. HDFS-3423.diff
        20 kB
        Ivan Kelly
      4. HDFS-3423.patch
        21 kB
        Uma Maheswara Rao G
      5. HDFS-3423.patch
        21 kB
        Uma Maheswara Rao G

        Issue Links

          Activity

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1097 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1097/)
          HDFS-3423. BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes. Contributed by Ivan and Uma. (Revision 1344840)

          Result = SUCCESS
          umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344840
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/EditLogLedgerMetadata.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/MaxTxId.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperEditLogStreams.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1063 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1063/)
          HDFS-3423. BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes. Contributed by Ivan and Uma. (Revision 1344840)

          Result = SUCCESS
          umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344840
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/EditLogLedgerMetadata.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/MaxTxId.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperEditLogStreams.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2378 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2378/)
          HDFS-3423. BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes. Contributed by Ivan and Uma. (Revision 1344840)

          Result = SUCCESS
          umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344840
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/EditLogLedgerMetadata.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/MaxTxId.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperEditLogStreams.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2306 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2306/)
          HDFS-3423. BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes. Contributed by Ivan and Uma. (Revision 1344840)

          Result = SUCCESS
          umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344840
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/EditLogLedgerMetadata.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/MaxTxId.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperEditLogStreams.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2323 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2323/)
          HDFS-3423. BKJM: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes. Contributed by Ivan and Uma. (Revision 1344840)

          Result = FAILURE
          umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344840
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/EditLogLedgerMetadata.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/MaxTxId.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperEditLogStreams.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
          Uma Maheswara Rao G added a comment -

          I have just committed this to trunk and branch-2. Thanks a lot, Ivan.

          Ivan Kelly added a comment -

          lgtm +1.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12530371/HDFS-3423.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2551//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2551//console

          This message is automatically generated.

          Uma Maheswara Rao G added a comment -

          Ivan, please take a look in case I missed anything.
          Thanks a lot.

          Uma Maheswara Rao G added a comment -

          Oh, I refactored the reset method using Eclipse, and it added that automatically. In fact it is already an IOE, and the method also throws IOE, so I think I can remove it. I will upload the patch again. Thanks Ivan.

          Ivan Kelly added a comment -

          Generally looked good to me.
          Why did you add UnsupportedEncodingException in MaxTxId?

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12530358/HDFS-3423.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2549//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2549//console

          This message is automatically generated.

          Uma Maheswara Rao G added a comment -

          OK, I have resolved the conflicts with HDFS-3474 and also cleaned up some very minor stuff.
          + Resolved conflicts with HDFS-3474
          + removed unused variables from TestBookKeeperJournalManager
          + removed unused imports
          import org.junit.Before;
          import org.junit.After;
          + bkjmutil.connectZooKeeper(); --> BKJMUtil.connectZooKeeper();
          + formatted hashcode method in EditLogLedgerMetadata

          +1, Pending jenkins report.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12530187/HDFS-3423.diff
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2542//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2542//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          Rebased on trunk. This patch clashes with HDFS-3474, so whichever goes in first, the other will have to be rebased.

          Rakesh R added a comment -

          Oh, I didn't consider one flow: after the edit log conversion, #store fails immediately. Then, on the next finalization attempt, it will throw NodeExistsException, and there the #store really is required. I misunderstood the case; sorry for the confusion.

          Patch looks good to me.

          Ivan Kelly added a comment -

          How would you change it though? I don't see a clean way to avoid the store that keeps the same clarity.

          Rakesh R added a comment -

          @Ivan

          The case where the #store is unnecessary(i.e. a crash during finalization) is much rarer than where it is necessary (i.e. crash while writing).

          Yes, that is exactly what I mean. Even though it's a corner case, IMHO we still have an opportunity to refactor while we are touching the BKJM flows (skip #store in this case, if you agree).

          By the way, the test cases are nice.

          Ivan Kelly added a comment -

          @Rakesh
          In the case of a primary NN crashing, the maxTxId.store() is necessary on recovery, to ensure that the value is correct. The case where the #store is unnecessary (i.e. a crash during finalization) is much rarer than the one where it is necessary (i.e. a crash while writing).

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528085/HDFS-3423.diff
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2477//console

          This message is automatically generated.

          Rakesh R added a comment -

          Yes Ivan, I understand and completely agree with you; there would not be any functional problem. But avoiding #store would save a few ZK calls.

          Ivan Kelly added a comment -

          New patch fixes findbugs

          Ivan Kelly added a comment -

          If the store here is to a txid older than what we currently have, nothing will happen. Take a look at the implementation of #store.
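The monotonic behaviour described here could be sketched as follows (a simplified in-memory model; the real MaxTxId class persists the value to a ZooKeeper znode):

```java
// Simplified in-memory model of a monotonic txid store: store() is a
// no-op for values at or below the current maximum, so replaying a stale
// txid during recovery cannot move the counter backwards.
public class MaxTxIdSketch {
    private long currentMax = 0;

    public synchronized void store(long txId) {
        if (txId > currentMax) {
            currentMax = txId; // the real class would write the znode here
        }
    }

    public synchronized long get() {
        return currentMax;
    }
}
```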

          Rakesh R added a comment -

          In the patch I see that only maxTxId.reset(maxTxId.get()-1); is invoked on SegmentEmptyException. But I'm thinking about the inprogress_x ledgers which are not empty and had previously been finalized but not deleted.

          The following code is taken from BKJM. When l.verify(zkc, finalisedPath) == true, instead of storing the maxTxId and deleting the znode, we could delete only the inprogress_x node, since the corresponding edit_x_y log file already exists. IMHO this is safer. What's your opinion?

                try {
                  l.write(zkc, finalisedPath);
                } catch (KeeperException.NodeExistsException nee) {
                  if (!l.verify(zkc, finalisedPath)) {
                    throw new IOException("Node " + finalisedPath + " already exists"
                                          + " but data doesn't match");
                  }
                }
                maxTxId.store(lastTxId);
                zkc.delete(inprogressPath, inprogressStat.getVersion());
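The suggestion could be modelled as a decision sketch (hypothetical helper, ZooKeeper interaction stubbed out as booleans; this is the variant being proposed, not the committed behaviour):

```java
// Hedged model of the decision only: return true when maxTxId.store()
// should run. In the proposed variant, a finalized znode that already
// exists with matching data means a previous attempt already got past
// #store, so the redundant ZK write could be skipped.
public class FinalizeStoreDecisionSketch {

    static boolean shouldStore(boolean finalizedNodeExists, boolean dataMatches) {
        if (finalizedNodeExists && !dataMatches) {
            // mirrors the IOException path in the snippet above
            throw new IllegalStateException(
                "finalized node exists but data doesn't match");
        }
        return !finalizedNodeExists;
    }
}
```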
          
          Ivan Kelly added a comment -

          This can't happen, as maxid always grows unless reset() is called, and reset() is only ever called with maxid.get()-1. And this situation will only occur when the last inprogress znode points to a ledger entry.

          Rakesh R added a comment -

          Hi Ivan,

          I have just gone through the patch. It's great!

          I have one small doubt about the old inprogress znodes. Say I have inprogress_27 and inprogress_65. Is there any ordering guarantee for ZooKeeper znode children, such that inprogress_27 comes first in the list before inprogress_65? If not, I can see one problem: 'maxTxId' could end up at 27.

          -Rakesh
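ZooKeeper's getChildren() makes no ordering guarantee for the returned child names, so the caller must sort them itself. A minimal sketch of sorting inprogress_&lt;txid&gt; names by their numeric suffix (class and helper names are hypothetical, not BKJM's actual code):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class InprogressOrdering {
    // getChildren() returns child names in no particular order, so we
    // sort inprogress_<txid> znodes numerically by the txid suffix.
    static List<String> sortByTxId(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(
            (String name) -> Long.parseLong(name.substring("inprogress_".length()))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> children = List.of("inprogress_65", "inprogress_27", "inprogress_61");
        // Prints the names in ascending txid order.
        System.out.println(sortByTxId(children));
    }
}
```

With numeric sorting, inprogress_65 reliably sorts after inprogress_27, so recovery can identify the latest segment regardless of the order ZooKeeper returned the children in.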

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12527365/HDFS-3423.diff
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2440//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          This patch requires HDFS-3058 to apply

          surendra singh lilhore added a comment -

          Hi Ivan,

          I’m attaching the detailed information taken from my env; I hope it will help us:
          BKJM edit log path in ZK -> /NN/ledgers. Here /NN is the root zNode for BKJM in zookeeper.

          [zk: localhost:2181(CONNECTED) 2] ls /NN/ledgers
          [edits_000000000000000093_000000000000000094, edits_000000000000000089_000000000000000090, inprogress_27,
          edits_000000000000000039_000000000000000076, edits_000000000000000001_000000000000000038, edits_000000000000000091_000000000000000092, 
          inprogress_65, inprogress_64, edits_000000000000000098_000000000000000099, inprogress_61, edits_000000000000000095_000000000000000096, 
          edits_000000000000000087_000000000000000088, edits_000000000000000079_000000000000000086, edits_000000000000000077_000000000000000078]
          

          Here the /NN/ledgers zNode contains four inprogress zNodes: inprogress_27, inprogress_65, inprogress_64, and inprogress_61. But inprogress_65 is the only active inprogress zNode; the others are old ones that were not removed by BK.

          Ivan Kelly added a comment -

          By which I mean, it'll be in the patch for this issue.

          Ivan Kelly added a comment -

          This second problem shouldn't happen, as if the metadata is the same for what has been finalised and for the inprogress znode, then the inprogress znode is silently removed. However, a typo was preventing this. Fixed now.

          surendra singh lilhore added a comment -

          In one more scenario, BKJM fails to recover bad inProgress_ zNodes and throws the following exception:

          Say the NN has successfully done finalizeLogSegment() but fails (the namenode is killed, or the ZK cluster goes down) before deleting the inprogress znode. The inprogress zNode would then be left in ZooKeeper. On the next NN startup, it again tries to perform recovery and throws the following exception.

          java.io.IOException: Node /NN/ledgers/edits_000000000000000039_000000000000000076 already exists but data doesn't match
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.finalizeLogSegment(BookKeeperJournalManager.java:306)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverUnfinalizedSegments(BookKeeperJournalManager.java:426)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet$6.apply(JournalSet.java:551)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:322)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:548)
          	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1134)
          	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:598)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
          	at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
          	at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
          	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
          	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
          	at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
          	at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
          	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
          	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:396)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
          	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
          

          Here, I feel that instead of throwing the exception we should delete the old inprogress znode, because I have seen that after restarting the namenode, /NamenodeNode/ledgers
          contains two inprogress znodes.

          Rakesh R added a comment -

          Yup, I also feel it's fine to retain the empty inprogress znode; it doesn't harm anything (if required, we could archive it later). But the NN should be able to start up to the last successful transaction.

          I agree with a warning message and continue the NN startup.

          Ivan Kelly added a comment -

          Ah yes, I'll work on a patch for this. Do you think an empty inprogress znode should be deleted completely, or archived somewhere? Personally, I think that warning about it should be enough.

          Rakesh R added a comment -

          Oh, I could have attached the logs. Thanks for the analysis.

          I killed the active NN after it created the new ledger, while the inprogress znode was in ZooKeeper (before any entry was written to the new ledger).

          Following are the logs:

          2012-05-09 16:55:11,349 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 10.18.40.155/10.18.40.155:2181, sessionid = 0x137314c462c0009, negotiated timeout = 4000
          2012-05-09 16:55:11,396 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager@193a83cc, stream=null))
          java.io.IOException: Exception retreiving last tx id for ledger [LedgerId:7, firstTxId:18, lastTxId:-12345, version:-40]
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:516)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverUnfinalizedSegments(BookKeeperJournalManager.java:418)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet$6.apply(JournalSet.java:551)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:322)
          	at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:548)
          	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1134)
          	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:598)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
          	at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
          	at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
          	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
          	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
          	at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
          	at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
          	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
          	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:396)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
          	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
          Caused by: java.io.IOException: Error reading entries from bookkeeper
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperEditLogInputStream$LedgerInputStream.nextStream(BookKeeperEditLogInputStream.java:198)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperEditLogInputStream$LedgerInputStream.read(BookKeeperEditLogInputStream.java:218)
          	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
          	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
          	at java.io.FilterInputStream.read(FilterInputStream.java:66)
          	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader$PositionTrackingInputStream.read(FSEditLogLoader.java:734)
          	at java.io.FilterInputStream.read(FilterInputStream.java:66)
          	at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:42)
          	at java.io.DataInputStream.readByte(DataInputStream.java:248)
          	at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Reader.decodeOp(FSEditLogOp.java:2275)
          	at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Reader.readOp(FSEditLogOp.java:2248)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperEditLogInputStream.nextOp(BookKeeperEditLogInputStream.java:100)
          	at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:74)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:506)
          	... 22 more
          Caused by: org.apache.bookkeeper.client.BKException$BKReadException
          	at org.apache.bookkeeper.client.BKException.create(BKException.java:48)
          	at org.apache.bookkeeper.client.LedgerHandle.readEntries(LedgerHandle.java:302)
          	at org.apache.hadoop.contrib.bkjournal.BookKeeperEditLogInputStream$LedgerInputStream.nextStream(BookKeeperEditLogInputStream.java:190)
          	... 35 more
          2012-05-09 16:55:11,397 INFO org.apache.hadoop.contrib.bkjournal.WriteLock: Zookeeper event WatchedEvent state:SyncConnected type:NodeDeleted path:/ledgers/lock/lock-0000000005 received, reapplying watch to null
          2012-05-09 16:55:11,400 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
          /************************************************************
          SHUTDOWN_MSG: Shutting down NameNode at HOST-10-18-40-91/10.18.40.91
          ************************************************************/
          

          I feel BKJM has no logic for a ledger which does not contain any entry.
          I think that if the current ledger contains no entries, it should take the end txnId from the previous ledger.

          long endTxId = HdfsConstants.INVALID_TXID;
                FSEditLogOp op = in.readOp();
                while (op != null) {
                  if (endTxId == HdfsConstants.INVALID_TXID
                      || op.getTransactionId() == endTxId+1) {
                    endTxId = op.getTransactionId();
                  }
                  op = in.readOp();
                }
                return endTxId;
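
The loop above yields HdfsConstants.INVALID_TXID when the ledger contains no entries, which a caller could use as a signal to fall back to the previous segment instead of failing startup. A runnable sketch of that behavior (class and method names are hypothetical; the Iterator stands in for the readOp() stream):

```java
import java.util.Iterator;
import java.util.List;

public class LastTxIdRecovery {
    static final long INVALID_TXID = -12345; // mirrors HdfsConstants.INVALID_TXID

    // Hypothetical stand-in for scanning a ledger: each long is a txid
    // read in sequence, as op.getTransactionId() would be in BKJM.
    static long recoverLastTxId(Iterator<Long> ops) {
        long endTxId = INVALID_TXID;
        while (ops.hasNext()) {
            long txId = ops.next();
            if (endTxId == INVALID_TXID || txId == endTxId + 1) {
                endTxId = txId;
            }
        }
        // INVALID_TXID here means the segment was empty; the caller can
        // skip or warn about the segment rather than abort NN startup.
        return endTxId;
    }

    public static void main(String[] args) {
        System.out.println(recoverLastTxId(List.of(18L, 19L, 20L).iterator()));
        System.out.println(recoverLastTxId(List.<Long>of().iterator()));
    }
}
```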
          
          Ivan Kelly added a comment -

          Have you seen this occur in your testing?

          There is only a single read of the ZooKeeper data, in the same way that there is only a single write (in EditLogLedgerMetadata#create). I make the assumption that ZooKeeper#getData(), #create, and #setData are atomic.
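Given that assumption, correctness rests on the whole metadata string being written in one create() call: either the full payload is stored or the znode does not exist. A sketch of building the single payload that read() splits on ';' (class and helper are hypothetical; the field order follows the description's version, ledgerId, firstTxId):

```java
public class LedgerMetadataPayload {
    // Builds the single byte[] payload that a read() splitting on ';'
    // would parse into three parts. Writing it in one ZooKeeper create()
    // call is what makes the write atomic: the znode either holds the
    // full string or is never created.
    static byte[] inprogressPayload(int version, long ledgerId, long firstTxId) {
        return (version + ";" + ledgerId + ";" + firstTxId).getBytes();
    }

    public static void main(String[] args) {
        // Example values taken from the log above: version -40, ledger 7, firstTxId 18.
        System.out.println(new String(inprogressPayload(-40, 7, 18)));
    }
}
```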

          Rakesh R added a comment -

          This is an endless condition that prevents the NN from starting as long as it has bad inprogress zNodes. IMHO, we should treat these as dirty or partial data, and it would be good to delete those entries while logging warning messages. Otherwise, to start the NN, the admin has to manually clean them up in ZooKeeper.


            People

            • Assignee:
              Ivan Kelly
              Reporter:
              Rakesh R
            • Votes:
              0
              Watchers:
              8
