Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-2982

Startup performance suffers when there are many edit log segments

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      For every one of the edit log segments, it seems like we are calling listFiles on the edit log directory inside of findMaxTransaction. This is killing performance, especially when there are many log segments and the directory is stored on NFS. It is taking several minutes to start up the NN when there are several thousand log segments present.

      1. HDFS-2982.009.patch
        87 kB
        Colin Patrick McCabe
      2. HDFS-2982.008.patch
        87 kB
        Colin Patrick McCabe
      3. HDFS-2982.007.patch
        87 kB
        Colin Patrick McCabe
      4. HDFS-2982.006.patch
        87 kB
        Colin Patrick McCabe
      5. HDFS-2982.005.patch
        84 kB
        Colin Patrick McCabe
      6. HDFS-2982.004.patch
        84 kB
        Colin Patrick McCabe
      7. HDFS-2982.003.patch
        84 kB
        Colin Patrick McCabe
      8. HDFS-2982.002.patch
        84 kB
        Colin Patrick McCabe
      9. HDFS-2982.001.patch
        87 kB
        Colin Patrick McCabe

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2302 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2302/)
          HDFS-2982. Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042)

          Result = FAILURE
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #2302 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2302/ ) HDFS-2982 . Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2357 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2357/)
          HDFS-2982. Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042)

          Result = SUCCESS
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2357 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2357/ ) HDFS-2982 . Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2284 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2284/)
          HDFS-2982. Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042)

          Result = SUCCESS
          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #2284 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2284/ ) HDFS-2982 . Startup performance suffers when there are many edit log segments. Contributed by Colin Patrick McCabe. (Revision 1342042) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342042 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRecovery.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
          Hide
          Todd Lipcon added a comment -

          Committed to branch-2 and trunk

          Show
          Todd Lipcon added a comment - Committed to branch-2 and trunk
          Hide
          Todd Lipcon added a comment -

          +1, looks good to me. Thanks, Colin, I'll commit this momentarily.

          Show
          Todd Lipcon added a comment - +1, looks good to me. Thanks, Colin, I'll commit this momentarily.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528535/HDFS-2982.009.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2500//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2500//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528535/HDFS-2982.009.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2500//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2500//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -
          • fix comment issues, test cleanup issues
          Show
          Colin Patrick McCabe added a comment - fix comment issues, test cleanup issues
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528506/HDFS-2982.008.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2498//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2498//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528506/HDFS-2982.008.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2498//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2498//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          Oops, missed part of my copy-paste comment:

          +    LOG.info(this + ": selecting input streams starting at " + fromTxId +
          +        (inProgressOk ? " (inProgress ok) " : " (excluding inProgress) ") +
          +        "from among " + elfs.size() + " candidate file(s)");
          

          This should probably be DEBUG level. Otherwise this will show up in the "safety check" selectInputStreams call, and potentially confuse users.

          Show
          Todd Lipcon added a comment - Oops, missed part of my copy-paste comment: + LOG.info( this + ": selecting input streams starting at " + fromTxId + + (inProgressOk ? " (inProgress ok) " : " (excluding inProgress) " ) + + "from among " + elfs.size() + " candidate file(s)" ); This should probably be DEBUG level. Otherwise this will show up in the "safety check" selectInputStreams call, and potentially confuse users.
          Hide
          Todd Lipcon added a comment -
          +            LOG.info("skipping the rest of " + this + " since we " +
          +                "reached txId " + txId);
          +            FSImage.LOG.warn("skipping " + skipAmt + " bytes at the end " +
                         "of edit log  '" + getName() + "': reached txid " + txId +
                         " out of " + lastTxId);
          

          No need for two log messages for the same thing.


          TestFileJournalManager.getJournalInputStream leaks all of the streams expect the specified one.

          Show
          Todd Lipcon added a comment - + LOG.info( "skipping the rest of " + this + " since we " + + "reached txId " + txId); + FSImage.LOG.warn( "skipping " + skipAmt + " bytes at the end " + "of edit log '" + getName() + "': reached txid " + txId + " out of " + lastTxId); No need for two log messages for the same thing. TestFileJournalManager.getJournalInputStream leaks all of the streams expect the specified one.
          Hide
          Colin Patrick McCabe added a comment -
          • avoid silly findbugs warning about case statement fall through (use recursion instead)
          Show
          Colin Patrick McCabe added a comment - avoid silly findbugs warning about case statement fall through (use recursion instead)
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528468/HDFS-2982.007.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528468/HDFS-2982.007.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2496//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          rebase on trunk

          Show
          Colin Patrick McCabe added a comment - rebase on trunk
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528463/HDFS-2982.006.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2495//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528463/HDFS-2982.006.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2495//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          I renamed the resync parameter to skipBrokenEdits, since that's what it is in Reader#readOp, and this function just passes it on to there. That is a pretty concise description of what it does.

          The changes to TestNameNodeRecovery are for correctness. Formerly, we were doing multiple mkdirs operations on the same directory. This resulted in only one mkdir operation getting added to the stream. Then when we corrupted the last edit, the mkdir operation was lost-- a bad thing, since we check for it later.

          I'm not 100% sure if calling cluster.waitActive() in this test is necessary, since we have 0 DataNodes. However, we do it everywhere else, and consistency is a good thing. Also, conceptually what we want is for the NameNode to come up and be active. It seems more robust to check for that directly rather than assuming that no part of edit log loading happens in the background.

          Show
          Colin Patrick McCabe added a comment - I renamed the resync parameter to skipBrokenEdits, since that's what it is in Reader#readOp, and this function just passes it on to there. That is a pretty concise description of what it does. The changes to TestNameNodeRecovery are for correctness. Formerly, we were doing multiple mkdirs operations on the same directory. This resulted in only one mkdir operation getting added to the stream. Then when we corrupted the last edit, the mkdir operation was lost-- a bad thing, since we check for it later. I'm not 100% sure if calling cluster.waitActive() in this test is necessary, since we have 0 DataNodes. However, we do it everywhere else, and consistency is a good thing. Also, conceptually what we want is for the NameNode to come up and be active. It seems more robust to check for that directly rather than assuming that no part of edit log loading happens in the background.
          Hide
          Todd Lipcon added a comment -
          • in BookKeeperJournalManager, getInputStream and getNumberOfTransactions can be made package-private now, right, since they aren't part of the JM API anymore?

          EditLogFileInputStream:

          • in init(), add Preconditions.checkState(state == State.UNINIT);
          • I would expect to see state = State.OPEN at the end of init(), not at its call site
          • in nextOpImpl, combine the two catch clauses into a catch (Throwable t) to avoid the duplicate code. You can use the Throwables class from Guava to propagate the exception without too much messy code.

          +        if (!resync) {
          +          throw e;
          +        }
          +        return null;
          
          • I think it's clearer to invert the logic here:
            if (resync) {
              return null;
            }
            throw e;
            

            since resync is the more unusual case, whereas rethrowing is the usual.

          Also, I think the resync parameter would be better named treatErrorsAsEof or something – it's not actually resynchronizing, it's just ignoring a single error and treating it like there being no more valid ops.


          +          LOG.info("skipping the rest of " + this + " since we " +
          +              "reached txId " + txId);
          +          long skipAmt = file.length() - tracker.getPos();
          +          if (skipAmt > 0) {
          

          This will get printed on every log, right? I think we should only log if skipAmt > 0.


          +      return nextOpImpl(true);
               } catch (IOException e) {
          +      LOG.error("nextValidOp: got exception while reading " + this, e);
          +      return null;
          +    } catch (RuntimeException e) {
          +      LOG.error("nextValidOp: got exception while reading " + this, e);
          

          can you combine the catch clauses, same as above?


          +          LOG.error("got IOException while trying to validate header of " +
          +              elf + ".  Skipping.", e);
          

          +      if (recovery == null) {
          +        closeAllStreams(streams);
          +        throw e;
          +      }
          +      LOG.error(e);
          

          Same as above - I think it is easier to understand as:

          if (recovery != null) {
            LOG.error(e);
            // and continue
          } else {
            closeAllStreams(streams);
            throw e;
          }
          

          +    // This code will go away as soon as RedundantEditLogInputStream is
          +    // introduced.
          

          Worth adding a reference to the JIRA here so people can find what you're talking about in the interim.


          +    // We don't want to throw an exception from here, because that would make
          +    // recovery impossible even if the user requested it.
          

          This comment makes more sense inside the catch clause. Also should add something like: an exception will be thrown later when actually applying edits, since we verify txids for each operation


          +      txId = elis.getLastTxId() + 1;
          

          can you add an assert elis.getLastTxId() > 0 here or some other sanity check to make sure we didn't accidentally get an in-progress segment?


          • I'm still unconvinced that the changes to validateEditLog belong here. They might be good changes, but they don't have anything to do with the performance fix. Your justification above was about why the changes are important for recovery mode, not why they're important for getting rid of the n^2 behavior of startup, which is what this JIRA is about.
          • Similarly, the other changes in FSEditLogLoader.java (regarding the treatment of logVersion) don't seem to have anything to do with this change.

          +    EDIT_LOG_INPUT_STREAM_COMPARATOR = new Comparator<EditLogInputStream>() {
          +      @Override
          +      public int compare(EditLogInputStream a, EditLogInputStream b) {
          +        long fa = a.getFirstTxId();
          +        long fb = b.getFirstTxId();
          +        if (fa < fb) {
          +          return -1;
          +        } else if (fa > fb) {
          +          return 1;
          +        }
          +        long la = a.getLastTxId();
          +        long lb = b.getLastTxId();
          +        if (lb < la) {
          +          return -1;
          +        } else if (lb > la) {
          +          return 1;
          +        }
          +        return a.getName().compareTo(b.getName());
          +      }
          +    };
          

          Use ComparisonChain from Guava here - much easier to read.


          • The algorithm in JournalSet.selectInputStreams could do with some explanation in an inline comment.
          • testManyEditLogSegments still missing the @Test annotation.
          • prepareUnfinalizedTestEditLog needs javadoc - especially given it has an "out-parameter" - it's not obvious what it does without doc
          • the "out-parameter" in fact doesn't appear to be used at all..

          +   * @return                The number of transactions
          +   * @throws IOException
          +   */
          +  static long getNumberOfTransactions(FileJournalManager jm, long fromTxId,
          +      boolean inProgressOk, boolean abortOnGap) throws IOException {
          

          this function doesn't seem to close its streams (I see it's moved code, but may as well fix this while you move it)


          • what's the reasoning of the changes in TestNameNodeRecovery?
          Show
          Todd Lipcon added a comment - in BookKeeperJournalManager, getInputStream and getNumberOfTransactions can be made package-private now, right, since they aren't part of the JM API anymore? EditLogFileInputStream: in init(), add Preconditions.checkState(state == State.UNINIT); I would expect to see state = State.OPEN at the end of init(), not at its call site in nextOpImpl , combine the two catch clauses into a catch (Throwable t) to avoid the duplicate code. You can use the Throwables class from Guava to propagate the exception without too much messy code. + if (!resync) { + throw e; + } + return null ; I think it's clearer to invert the logic here: if (resync) { return null ; } throw e; since resync is the more unusual case, whereas rethrowing is the usual. Also, I think the resync parameter would be better named treatErrorsAsEof or something – it's not actually resynchronizing, it's just ignoring a single error and treating it like there being no more valid ops. + LOG.info( "skipping the rest of " + this + " since we " + + "reached txId " + txId); + long skipAmt = file.length() - tracker.getPos(); + if (skipAmt > 0) { This will get printed on every log, right? I think we should only log if skipAmt > 0. + return nextOpImpl( true ); } catch (IOException e) { + LOG.error( "nextValidOp: got exception while reading " + this , e); + return null ; + } catch (RuntimeException e) { + LOG.error( "nextValidOp: got exception while reading " + this , e); can you combine the catch clauses, same as above? + LOG.error( "got IOException while trying to validate header of " + + elf + ". Skipping." , e); + if (recovery == null ) { + closeAllStreams(streams); + throw e; + } + LOG.error(e); Same as above - I think it is easier to understand as: if (recovery != null ) { LOG.error(e); // and continue } else { closeAllStreams(streams); throw e; } + // This code will go away as soon as RedundantEditLogInputStream is + // introduced. Worth adding a reference to the JIRA here so people can find what you're talking about in the interim. + // We don't want to throw an exception from here, because that would make + // recovery impossible even if the user requested it. This comment makes more sense inside the catch clause. Also should add something like: an exception will be thrown later when actually applying edits, since we verify txids for each operation + txId = elis.getLastTxId() + 1; can you add an assert elis.getLastTxId() > 0 here or some other sanity check to make sure we didn't accidentally get an in-progress segment? I'm still unconvinced that the changes to validateEditLog belong here. They might be good changes, but they don't have anything to do with the performance fix. Your justification above was about why the changes are important for recovery mode, not why they're important for getting rid of the n^2 behavior of startup, which is what this JIRA is about. Similarly, the other changes in FSEditLogLoader.java (regarding the treatment of logVersion ) don't seem to have anything to do with this change. + EDIT_LOG_INPUT_STREAM_COMPARATOR = new Comparator<EditLogInputStream>() { + @Override + public int compare(EditLogInputStream a, EditLogInputStream b) { + long fa = a.getFirstTxId(); + long fb = b.getFirstTxId(); + if (fa < fb) { + return -1; + } else if (fa > fb) { + return 1; + } + long la = a.getLastTxId(); + long lb = b.getLastTxId(); + if (lb < la) { + return -1; + } else if (lb > la) { + return 1; + } + return a.getName().compareTo(b.getName()); + } + }; Use ComparisonChain from Guava here - much easier to read. The algorithm in JournalSet.selectInputStreams could do with some explanation in an inline comment. testManyEditLogSegments still missing the @Test annotation. prepareUnfinalizedTestEditLog needs javadoc - especially given it has an "out-parameter" - it's not obvious what it does without doc the "out-parameter" in fact doesn't appear to be used at all.. + * @ return The number of transactions + * @ throws IOException + */ + static long getNumberOfTransactions(FileJournalManager jm, long fromTxId, + boolean inProgressOk, boolean abortOnGap) throws IOException { this function doesn't seem to close its streams (I see it's moved code, but may as well fix this while you move it) what's the reasoning of the changes in TestNameNodeRecovery?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528152/HDFS-2982.005.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

          org.apache.hadoop.hdfs.TestDatanodeBlockScanner

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2481//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2481//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528152/HDFS-2982.005.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal: org.apache.hadoop.hdfs.TestDatanodeBlockScanner +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2481//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2481//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          Just as a note, the EditLogTailer issue was a side effect of separating this patch from HDFS-3049. Formerly, RedundantEditLogInputStream would handle doing the skipUntil that we are now doing in selectInputStreams.

          Show
          Colin Patrick McCabe added a comment - Just as a note, the EditLogTailer issue was a side effect of separating this patch from HDFS-3049 . Formerly, RedundantEditLogInputStream would handle doing the skipUntil that we are now doing in selectInputStreams.
          Hide
          Colin Patrick McCabe added a comment -
          • fix EditLogTailer
          Show
          Colin Patrick McCabe added a comment - fix EditLogTailer
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528123/HDFS-2982.004.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

          org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2480//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2480//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528123/HDFS-2982.004.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal: org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2480//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2480//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          left out a hunk from the previous patch (that's what I get for posting from a train I guess)

          Show
          Colin Patrick McCabe added a comment - left out a hunk from the previous patch (that's what I get for posting from a train I guess)
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528117/HDFS-2982.003.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2478//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528117/HDFS-2982.003.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. -1 javac. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2478//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          rebase

          Show
          Colin Patrick McCabe added a comment - rebase
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528009/HDFS-2982.002.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2471//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528009/HDFS-2982.002.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2471//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -
          • EditLogInputStream#init: use IOUtils.cleanup
          • edit log validation: stop validating at the earliest unreadable txid (i.e., the old behavior.) The new behavior can be introduced by HDFS-3049.
          • JournalSet#selectInputStreams: improve doxygen
          • remove readAllEdits since it's deadcode (it belongs with HDFS-3049)
          Show
          Colin Patrick McCabe added a comment - EditLogInputStream#init: use IOUtils.cleanup edit log validation: stop validating at the earliest unreadable txid (i.e., the old behavior.) The new behavior can be introduced by HDFS-3049 . JournalSet#selectInputStreams: improve doxygen remove readAllEdits since it's deadcode (it belongs with HDFS-3049 )
          Hide
          Colin Patrick McCabe added a comment -

          The javadoc for JournalSet#selectInputStreams is a little over-simplified =) - how about describing the algorithm (get the streams starting with fromTxid from all managers, return a list sorted by the starting txid etc)

          Ok, will add.

          In EditLogFileInputStream#init why only close the stream that threw?

          Yeah, I guess closing an already closed stream should be idempotent, at least if they're correctly implementing the Closable interface.

          In TestEditLog readAllEdits is dead code

          ok

          How about describing the high-level approach in the patch?

          From the high level, this patch is about getting rid of two APIs in JournalManager-- getNumberOfTransactions and getInputStream, and adding one API to JournalManager-- selectInputStreams. The new API simply gathers up all the available streams in one go and puts them into a Collection. This is more efficient, and also better for some of the changes we'd like to make in the future, like supporting overlapping edit log streams.

          Edit log validation is the process of finding out how far in-progress edit logs go. We do it during edit log finalization so that we can find out what to rename the in-progress edit log file to. ("validation" might not be a great name for this process, but it's probably too late to change it now.) We don't validate finalized logs.

          There are some minor changes to validation here, and a major change.

          First, the minor changes. One change is to have the validation class contain only the end txid, rather than the start txid, number of txids, and end txid. The start txid is already known, and the number of txids does not represent what you might think, but merely end - start + 1. So it's good to get rid of that cruft. Another minor change is that EditLogValidation#corruptionDetected was renamed to EditLogValidation#hasCorruptHeader. That is the concept it always represented-- it never referred to anything other than header corruption, and the rest of the code even uses the terminology hasCorruptHeader to represent this info (see EditLogFile#hasCorruptHeader). So I'm just trying to be consistent.

          The major change is that we now read to the end of a corrupt file in validation, finding the true end transaction rather than merely the first unreadable txid. This is needed for recovery to work properly on these files. It's possible that this change could be dropped from this patch. Conceptually, it's more related to HDFS-3049.

          Show
          Colin Patrick McCabe added a comment - The javadoc for JournalSet#selectInputStreams is a little over-simplified =) - how about describing the algorithm (get the streams starting with fromTxid from all managers, return a list sorted by the starting txid etc) Ok, will add. In EditLogFileInputStream#init why only close the stream that threw? Yeah, I guess closing an already closed stream should be idempotent, at least if they're correctly implementing the Closable interface. In TestEditLog readAllEdits is dead code ok How about describing the high-level approach in the patch? From the high level, this patch is about getting rid of two APIs in JournalManager-- getNumberOfTransactions and getInputStream, and adding one API to JournalManager-- selectInputStreams. The new API simply gathers up all the available streams in one go and puts them into a Collection. This is more efficient, and also better for some of the changes we'd like to make in the future, like supporting overlapping edit log streams. Edit log validation is the process of finding out how far in-progress edit logs go. We do it during edit log finalization so that we can find out what to rename the in-progress edit log file to. ("validation" might not be a great name for this process, but it's probably too late to change it now.) We don't validate finalized logs. There are some minor changes to validation here, and a major change. First, the minor changes. One change is to have the validation class contain only the end txid, rather than the start txid, number of txids, and end txid. The start txid is already known, and the number of txids does not represent what you might think, but merely end - start + 1. So it's good to get rid of that cruft. Another minor change is that EditLogValidation#corruptionDetected was renamed to EditLogValidation#hasCorruptHeader. That is the concept it always represented-- it never referred to anything other than header corruption, and the rest of the code even uses the terminology hasCorruptHeader to represent this info (see EditLogFile#hasCorruptHeader). So I'm just trying to be consistent. The major change is that we now read to the end of a corrupt file in validation, finding the true end transaction rather than merely the first unreadable txid. This is needed for recovery to work properly on these files. It's possible that this change could be dropped from this patch. Conceptually, it's more related to HDFS-3049 .
          Hide
          Colin Patrick McCabe added a comment -

          There are lots and lots of unit tests would have to change if EditLogInputStream started requiring an init() call. Not to mention the subtle bugs that might crop up. That alone would almost be worth its own patch. Let's deal with this later if we decide it's something worth doing. Frankly, I would argue against it because I think there's better APIs we could design. In particular, an API which separates the concept of a stream from the concept of a stream location is much more efficient and results in cleaner code, because the invariant that you can't use something without initializing it is then enforced by the type system. So basically, can we revisit this idea later, as in after this week?

          The new test case is missing the @Test annotation so it won't run.

          Will fix.

          Are the changes to validateEditLog necessary here? And the change to how corrupt files are handled?

          It's often really time consuming to change these things because then I have to redo all the unit tests. Still, I will take a look at it.

          Show
          Colin Patrick McCabe added a comment - There are lots and lots of unit tests would have to change if EditLogInputStream started requiring an init() call. Not to mention the subtle bugs that might crop up. That alone would almost be worth its own patch. Let's deal with this later if we decide it's something worth doing. Frankly, I would argue against it because I think there's better APIs we could design. In particular, an API which separates the concept of a stream from the concept of a stream location is much more efficient and results in cleaner code, because the invariant that you can't use something without initializing it is then enforced by the type system. So basically, can we revisit this idea later, as in after this week? The new test case is missing the @Test annotation so it won't run. Will fix. Are the changes to validateEditLog necessary here? And the change to how corrupt files are handled? It's often really time consuming to change these things because then I have to redo all the unit tests. Still, I will take a look at it.
          Hide
          Todd Lipcon added a comment -

          Hi Colin. Many of my comments from HDFS-3049 still apply (eg about the lazy initialization of the reader stream)

          Are the changes to validateEditLog necessary here? And the change to how corrupt files are handled? It seems like they fit more appropriately into HDFS-3049. I think you should be able to separate those out from this performance fix.

          The new test case is missing the @Test annotation so it won't run.

          Show
          Todd Lipcon added a comment - Hi Colin. Many of my comments from HDFS-3049 still apply (eg about the lazy initialization of the reader stream) Are the changes to validateEditLog necessary here? And the change to how corrupt files are handled? It seems like they fit more appropriately into HDFS-3049 . I think you should be able to separate those out from this performance fix. The new test case is missing the @Test annotation so it won't run.
          Hide
          Eli Collins added a comment -

          Hey Colin,

          Took a quick look. How about describing the high-level approach in the patch?

          • The javadoc for JournalSet#selectInputStreams is a little over-simplified =) - how about describing the algorithm (get the streams starting with fromTxid from all managers, return a list sorted by the starting txid etc)
          • In EditLogFileInputStream#init why only close the stream that threw?
          • In TestEditLog readAllEdits is dead code
          Show
          Eli Collins added a comment - Hey Colin, Took a quick look. How about describing the high-level approach in the patch? The javadoc for JournalSet#selectInputStreams is a little over-simplified =) - how about describing the algorithm (get the streams starting with fromTxid from all managers, return a list sorted by the starting txid etc) In EditLogFileInputStream#init why only close the stream that threw? In TestEditLog readAllEdits is dead code
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12527992/HDFS-2982.001.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

          org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2468//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2468//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12527992/HDFS-2982.001.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal: org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2468//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2468//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          With 9000 files over NFS it took 20 minutes to call "selectInputStream". I believe the runtime ends up being O calls to listFiles, each of which is O runtime, which makes the startup time grow quadratically with number of edits logs.

          Show
          Todd Lipcon added a comment - With 9000 files over NFS it took 20 minutes to call "selectInputStream". I believe the runtime ends up being O calls to listFiles, each of which is O runtime, which makes the startup time grow quadratically with number of edits logs.

            People

            • Assignee:
              Colin Patrick McCabe
              Reporter:
              Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development