Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.23.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      FSImage currently derives from Storage and FSEditLog has to call methods directly on FSImage to access the filesystem. This JIRA is to separate the Storage class out into NNStorage so that FSEditLog is less dependent on FSImage. From this point, the other parts of the circular dependency should be easy to fix.
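A minimal sketch of the dependency break described above (class shapes and method names are illustrative only, not the actual patch):

```java
// Hypothetical sketch: storage-directory handling moves out of FSImage into a
// dedicated NNStorage class, so FSEditLog depends on NNStorage rather than on
// FSImage. All bodies are placeholders.
class Storage { /* generic storage-directory management */ }

// NameNode-specific storage state, split out of FSImage.
class NNStorage extends Storage {
    void errorDirectory(java.io.File dir) { /* notify registered listeners */ }
}

class FSEditLog {
    private final NNStorage storage;   // was: a back-reference to FSImage
    FSEditLog(NNStorage storage) { this.storage = storage; }
}

class FSImage {
    final NNStorage storage = new NNStorage();
    final FSEditLog editLog = new FSEditLog(storage);
}
```

With this shape, FSEditLog never needs to reach through FSImage to touch the filesystem, which is what makes the rest of the circular dependency tractable.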

      1. HDFS-1557.diff
        157 kB
        Ivan Kelly
      2. HDFS-1557-trunk.diff
        160 kB
        Ivan Kelly
      3. HDFS-1557-branch-0.22.diff
        157 kB
        Ivan Kelly
      4. HDFS-1557-trunk.diff
        160 kB
        Ivan Kelly
      5. HDFS-1557-branch-0.22.diff
        157 kB
        Ivan Kelly
      6. HDFS-1557-trunk.diff
        164 kB
        Ivan Kelly
      7. HDFS-1557.diff
        163 kB
        Ivan Kelly
      8. HDFS-1557.diff
        191 kB
        Ivan Kelly
      9. HDFS-1557.diff
        196 kB
        Ivan Kelly
      10. HDFS-1557.diff
        196 kB
        Ivan Kelly
      11. HDFS-1557.diff
        196 kB
        Ivan Kelly
      12. HDFS-1557.diff
        197 kB
        Ivan Kelly
      13. HDFS-1557.diff
        197 kB
        Ivan Kelly
      14. 1557-suggestions.txt
        6 kB
        Todd Lipcon
      15. HDFS-1557.diff
        197 kB
        Ivan Kelly
      16. HDFS-1557.diff
        197 kB
        Ivan Kelly

        Issue Links

          Activity

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #643 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/)

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #539 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/539/)

          Ivan Kelly added a comment -

          That's great! Thanks for your help on this, and congrats on your first commit!

          Jitendra Nath Pandey added a comment -

          I just committed this. Thanks to Ivan!

          Jitendra Nath Pandey added a comment -

          Correction in previous comment: Release audit warnings are in OfflineEditsViewer.

          Jitendra Nath Pandey added a comment -

          The three failed tests and test-contrib also fail without the patch. Release audit warnings are due to OfflineImageViewer and are unrelated to this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12469966/HDFS-1557.diff
          against trunk revision 1065960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 42 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 2 release audit warnings (more than the trunk's current 0 warnings).

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.security.TestClientProtocolWithDelegationToken
          org.apache.hadoop.hdfs.security.token.block.TestBlockToken
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/145//testReport/
          Release audit warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/145//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/145//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/145//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          +1 from me, looks great.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12469947/1557-suggestions.txt
          against trunk revision 1065960.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/142//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Hi Ivan. Here are a few more suggested changes (minor cosmetic fixes) and also a couple of notes in the code where some things looked strange - I figured it was easier to write these notes in patch form.

          Todd Lipcon added a comment -

          JCarder looks pretty good - one potential deadlock but it's an old one I filed months back. So, +1 from me after addressing the nits Jakob and I mentioned above.

          Todd Lipcon added a comment -

          I assume Suresh's comments were supposed to go to another JIRA?

          I agree with Jakob on the naming for errorStreams()

          Another small nit:

          • getStorageDirectoryForStream javadoc: "stream to remove" isn't really a good description of this parameter, since the method itself doesn't remove anything.

          Aside from that, it looks good - I pored over the diff for a while and it seems reasonable. I imagine as we work in the new code we'll find some improvements we can make, but overall I think the structure is improved.

          I'm also running the unit tests through JCarder to make sure we don't have any new lock inversions. Should be done by morning.

          Suresh Srinivas added a comment -

          Most of the comments are minor:

          1. AbstractFileSystem.java - remove unused import IdentityHashMap
          2. AbstractFileSystem#getStatistics() - can you make only the inner part that accesses HashMap synchronized, excluding validation of URI.
          3. minor: remove empty line changes before AbstractFileSystem#getStatistics() method.
          4. STATISTICS_TABLE declaration goes beyond 80 columns
          5. FileContext#rename() can you change the link in javadoc from Rename#OVERWRITE to Options.Rename#OVERWRITE to fix a javadoc warning
          6. FileContext.java - the newly added methods need not be synchronized, as synchronization is handled by AbstractFileSystem.
          7. FileContext#getAllStatistics() in javadoc unnecessarily indicates URISyntaxException is thrown.
          8. FileContext#clearStatistics() - javadoc should indicate this method clears statistics for all the file systems.
          9. FileContext#getStatistics() - for @param uri, have a fullstop after "the uri to lookup the statistics".
          10. FileContext#printStatistics() - why does this method throw IOException? Remove it both from javadoc and declaration.
          11. Complete the @param argument for new copy constructor in Statistics.
          12. FCStatisticsBaseTest - "Base class to test File Context Statistics." instead of File Context add link to FileContext.
          13. FCStatisticsBaseTest - why do you need exact URI from getSchemeAuthorityUri()? The API handles any URI by picking scheme and authority right?
          14. TestLocalFsFCStatistics - please add comment on why you are doing blockSize + 12? Also please change the class javadoc to indicate that this is testing stats for LocalFs.
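Item 2 above can be sketched as follows: validate the URI outside the lock, and hold the monitor only while touching the shared map. The names (STATISTICS_TABLE, Statistics, getStatistics) follow the discussion, but the bodies are illustrative, not the actual Hadoop code.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class StatisticsTable {
    static class Statistics {
        final String scheme;
        Statistics(String scheme) { this.scheme = scheme; }
    }

    private static final Map<URI, Statistics> STATISTICS_TABLE = new HashMap<>();

    static Statistics getStatistics(URI uri) {
        // URI validation needs no lock; do it before entering the synchronized block.
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URI has no scheme: " + uri);
        }
        synchronized (STATISTICS_TABLE) {   // lock held only for the map access
            return STATISTICS_TABLE.computeIfAbsent(uri, u -> new Statistics(scheme));
        }
    }
}
```

Narrowing the synchronized region this way keeps the argument-validation work (and any exception it throws) off the critical section.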
          Jakob Homan added a comment -

          This is a very difficult patch to review due to the combination of automated refactoring and the introduction of new classes. I imagine there will be many more comments after this goes in and we proceed down this road. For now, I'd like to see the synchronized void errorStreams(List<EditLogOutputStream> errorStreams) method renamed to something more descriptive. Otherwise, it looks reasonable.

          Ivan Kelly added a comment -

          Addressed Suresh's latest comments. I haven't done any local testing with this patch though, as trunk isn't compiling, due to a change in Hadoop Common I think.

          Todd Lipcon added a comment -

          Hey Ivan. Now that Suresh and Jitendra have taken a look, I'd like to take a quick look too at the final version - can you give me the weekend to review before committing? Thanks.

          Suresh Srinivas added a comment -

          Missed in my previous comment:
          Change LOG.info in NNStorage#setRestoreFailedStorage() to LOG.warn (it seems more appropriate)

          Suresh Srinivas added a comment -

          Some minor comments:

          1. add listeners.clear() to NNStorage#close()
          2. BackupImage - In the class javadoc, fix the typo "Extention" to "Extension". Remove the empty comment on the BackupImage constructor.
          3. FSImage.java has many unnecessary imports.

          With this addressed, +1 for the patch.

          Ivan Kelly added a comment -

          That findbugs warning wasn't actually a bug. I was intentionally synchronizing on that array because, if attemptRestore runs multiple times in sequence, a directory may be re-added multiple times, and I didn't want to lock the whole NNStorage object. I've added a new member just for locking now, to avoid the findbugs error.
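A sketch of the fix described here: Findbugs flags synchronizing directly on a java.util.concurrent collection, so a dedicated lock object is introduced instead. Field and method names mirror the discussion but the bodies are illustrative.

```java
import java.util.concurrent.CopyOnWriteArrayList;

class NNStorageSketch {
    private final CopyOnWriteArrayList<String> removedStorageDirs =
        new CopyOnWriteArrayList<>();
    // Dedicated lock member, added so we neither synchronize on the
    // CopyOnWriteArrayList (the findbugs warning) nor lock all of NNStorage.
    private final Object restorationLock = new Object();

    void attemptRestoreRemovedStorage() {
        synchronized (restorationLock) {   // was: synchronized (removedStorageDirs)
            for (String dir : removedStorageDirs) {   // iterates over a snapshot
                // restore the directory, then drop it from the removed list so a
                // second sequential pass cannot restore it again
                removedStorageDirs.remove(dir);
            }
        }
    }
}
```

The lock serializes whole restore passes against each other, which is the property the CopyOnWriteArrayList alone cannot give.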

          Jitendra Nath Pandey added a comment -

          I ran test-patch on the latest patch.
          There is a findbug error
          Synchronization performed on java.util.concurrent.CopyOnWriteArrayList in org.apache.hadoop.hdfs.server.namenode.NNStorage.attemptRestoreRemovedStorage(boolean)

          at line
          synchronized (this.removedStorageDirs) in NNStorage

          Overall test-patch results were

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 42 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system tests framework. The patch passed system tests framework compile.

          Ivan Kelly added a comment -

          Addressed Suresh's comments.

          > NNStorage uses the synchronized method errorDirectory() to notify listeners of an error. The listeners implement synchronized methods to handle the error. Is it possible for listeners (say FSImage) to be calling, from their synchronized sections, a synchronized method on NNStorage? This could cause deadlocks.

          NNStorage uses CopyOnWriteArrayLists now, so errorDirectory is no longer synchronized. (see below)

          > format(), registerListener() - should these be synchronized, as they manipulate listeners?

          listeners is now a CopyOnWriteArrayList.

          > Storage#storageDirs is manipulated in NNStorage and Storage. The way it is done is not thread safe. Perhaps the existing code is thread safe itself. This could be addressed in a separate jira.

          Well spotted. I don't think any of this is threadsafe, given that storageDirs is modified in numerous places and is constantly being iterated over, which could trigger a ConcurrentModificationException.

          I've made storageDirs and removedStorageDirs CopyOnWriteArrayLists now.

          > Consider making the following methods package private: isPreUpgradableLayout(), setRestoreFailedStorage() (both variants), attemptRestoreRemovedStorage()... There are other methods that could only be used within the package. This makes sure this is not a class for outside consumption. I would further consider making NNStorage a non-public class.

          Tightened up all the access privileges I could on that class to package private. Unfortunately, NNStorage itself must remain public because UpgradeUtilities in the tests is in a different package.
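The listener pattern described above can be sketched like this: with a CopyOnWriteArrayList of listeners, notification needs no synchronized block, so a listener that calls back into NNStorage from its own synchronized method cannot deadlock against errorDirectory. Interface and method names follow the discussion but are illustrative.

```java
import java.util.concurrent.CopyOnWriteArrayList;

interface NNStorageListener {
    void errorOccurred(String dir);
}

class ListenerSketch {
    private final CopyOnWriteArrayList<NNStorageListener> listeners =
        new CopyOnWriteArrayList<>();

    void registerListener(NNStorageListener l) {
        listeners.add(l);               // thread-safe without external locking
    }

    void errorDirectory(String dir) {   // no longer synchronized
        for (NNStorageListener l : listeners) {  // iterates over a snapshot
            l.errorOccurred(dir);       // callback runs with no NNStorage lock held
        }
    }
}
```

The trade-off is that a listener registered mid-notification is not seen by the in-flight iteration, which is usually acceptable for error reporting.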

          Suresh Srinivas added a comment -

          Some comments:

          1. Storage#deleteDir() javadoc should read, recursively delete all the content of the directory first and then the directory
          2. General comment: Add javadoc at the class level to all the new classes introduced.
          3. NNStorage.java
            • StorageListener could be renamed to NNStorageListener. The existing name makes it sound like it works with Storage.
            • NNStorage uses the synchronized method errorDirectory() to notify listeners of an error. The listeners implement synchronized methods to handle the error. Is it possible for listeners (say FSImage) to be calling, from their synchronized sections, a synchronized method on NNStorage? This could cause deadlocks.
            • setCheckpointTime - javadoc reference @see setCheckpointTimeInStorage is invalid.
            • remove unneeded import StorageDirectory.
            • make member listeners, removedStorageDirs final
            • Add to methods with @Override what super class it is overriding
            • format(), registerListener() should this be synchronized as it manipulates listeners?
            • Storage#storageDirs is manipulated in NNStorage and Storage. The way it is done is not thread safe. Perhaps the existing code is thread safe itself. This could be addressed in a separate jira.
            • Consider making the following methods package private: isPreUpgradableLayout(), setRestoreFailedStorage() (both variants), attemptRestoreRemovedStorage()... There are other methods that could only be used within the package. This makes sure this is not a class for outside consumption. I would further consider making NNStorage a non-public class.
          4. BackupImage.java
            • Instead of storage.getStorageFile(...), consider using static access Storage.getStorageFile(...) to avoid a warning.
          Ivan Kelly added a comment -

          According to the checklist, it's the Sun convention, which actually says both are acceptable.

          http://wiki.apache.org/hadoop/CodeReviewChecklist
          http://www.oracle.com/technetwork/java/codeconventions-136091.html#262

          From the Sun document.

          • Align the new line with the beginning of the expression at the same level on the previous line.
          • If the above rules lead to confusing code or to code that's squished up against the right margin, just indent 8 spaces instead.

          It seems that emacs defaults to the former, while eclipse defaults to the latter. In the cases in which everything was squished to the right, I've applied the eclipse form.

          I've removed all the 80+ lines I introduced.
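The two continuation-indent styles quoted above can be illustrated as follows; the method names are made up for the example. The first aligns the wrapped line with the start of the expression (the emacs default), the second falls back to a deeper fixed indent when alignment would squash code against the right margin (the eclipse default).

```java
class IndentStyles {
    void example(String someArgument, String anotherArgument) {
        // Style 1: align the new line with the beginning of the expression
        // at the same level on the previous line.
        String joined = someArgument.concat(anotherArgument)
                                    .trim();

        // Style 2: when style 1 would squish code against the right margin,
        // just indent 8 spaces instead.
        String other = someVeryLongHelperMethodName(someArgument,
                anotherArgument);
    }

    String someVeryLongHelperMethodName(String a, String b) {
        return a + b;
    }
}
```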

          Jitendra Nath Pandey added a comment -

          A couple of things I missed before:

          FSEditLog.java

          • In errorOccurred method, the LOG.error is being printed for all editStreams.
          • In formatOccurred, getStorageFile is static, but is being accessed non-statically.
          Jitendra Nath Pandey added a comment -

          1. Storage.java

          • Since checkVersionUpgradable and deleteDir are now public, please add javadoc for them as well.

          2. FSEditLog.java

          • References to fsimage can be removed and we don't need to pass fsimage in the constructor.

          3. NNStorage.java

          • In many public methods the javadoc doesn't capture the parameters and/or exceptions, for example
            errorDirectory or errorDirectories.

          >> Can we make checkpointTime private?
          > .. However, there is already a setCheckpointTime which writes the checkpoint time to the storage directories at the same time. ..
          Can we rename the existing method to something like setCheckpointTimeInStorage and redefine the setCheckpointTime to just set the variable? This variable was protected in FSImage earlier, we should avoid making it public.
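          The suggested split could be sketched like this (all names and the List stand-in for on-disk state are illustrative, not the actual NNStorage code):

          ```java
          import java.util.ArrayList;
          import java.util.List;

          // Hedged sketch: setCheckpointTime only updates the in-memory field,
          // while setCheckpointTimeInStorage also writes it through to storage.
          public class CheckpointTimeDemo {
              private long checkpointTime;                                // was protected in FSImage
              private final List<Long> persisted = new ArrayList<Long>(); // stand-in for storage dirs

              // Plain setter: just sets the variable, no I/O.
              void setCheckpointTime(long newCpT) {
                  checkpointTime = newCpT;
              }

              // Renamed variant of the old behaviour: set and persist.
              void setCheckpointTimeInStorage(long newCpT) {
                  setCheckpointTime(newCpT);
                  persisted.add(newCpT); // stand-in for writeCheckpointTime(sd) per directory
              }

              long getCheckpointTime() { return checkpointTime; }
              int persistedWrites() { return persisted.size(); }

              public static void main(String[] args) {
                  CheckpointTimeDemo demo = new CheckpointTimeDemo();
                  demo.setCheckpointTime(1L);          // no write
                  demo.setCheckpointTimeInStorage(2L); // one write
                  System.out.println(demo.getCheckpointTime() + " " + demo.persistedWrites()); // prints "2 1"
              }
          }
          ```

          With this split the field can stay private, and callers choose explicitly whether a write to the storage directories should happen.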

          Comments related to indentation:

          1. Lines more than 80 chars in a few files for example
          -Namenode.java, NamenodeJspHelper.java, FSEditLog.java, SecondaryNameNode.java

          >What is the standard? Is it codified anywhere? The indentation on these looks fine to me now. There were a few tabs hanging about, but it's all 2-space now.

          I believe the convention is 2 spaces per level and 4 spaces if a statement continues on the next line. For example, in the following statement the next line starts with 4 spaces:
          new IOException("xyz" +
              "abc")
          Usually eclipse formats it this way once rules are set.
          Other details about the code formatting rules can be found on hadoop wiki.

          I pointed out the indentation because I noticed lots of spaces for statements continued on a new line at a few places in the patch.

          Todd Lipcon added a comment -

          Hi Ivan. FYI, TestFileConcurrentReader is definitely flaky on trunk these days.

          Ivan Kelly added a comment -

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestBackupNode
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          TestStorageRestore is failing on trunk. See HDFS-1496.

          contrib tests fail due to:

          [exec] /grid/0/hudson/hudson-slave/workspace/PreCommit-HDFS-Build/trunk/src/contrib/hdfsproxy/build.xml:294: org.codehaus.cargo.container.ContainerException: Failed to download http://apache.osuosl.org/tomcat/tomcat-6/v6.0.24/bin/apache-tomcat-6.0.24.zip

          This happens on trunk too.

          TestBackupNode fails with a JVM crash. Passes completely on a local run.

          TestFileConcurrentReader fails with the OS running out of file handles. Passes completely on a local run.

          I've kicked hudson to make it run the tests again, but it hasn't run yet. Not worried about TestBackupNode, but I would like to see TestFileConcurrentReader run cleanly on hudson to be a little more confident it was just a sporadic failure.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12468578/HDFS-1557.diff
          against trunk revision 1059508.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestBackupNode
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/113//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/113//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/113//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          Forgot to re-add NNStorage after applying to trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12468575/HDFS-1557.diff
          against trunk revision 1059508.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:

          -1 contrib tests. The patch failed contrib unit tests.

          -1 system test framework. The patch failed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/112//testReport/
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/112//console

          This message is automatically generated.

          Ivan Kelly added a comment -
          • Please also comment on the purpose of StorageListener interface in this jira.
            As I understand from the patch, upgradeManager is set in NNStorage so that NNStorage has no other dependency on FSNamesystem. Is that correct? A comment about this would be helpful.

          Yes, this is correct. I think a comment to this effect in the code may be confusing to a reader, who'd wonder why Storage would require a reference to FSNamesystem in the first place.

          3. NNStorage.java

          • Most of the methods in NNStorage are those removed from FSImage. Is that correct? Please comment on the jira if there are any changes to the semantics for some methods moved from FSImage to NNStorage.

          The following are verbatim copies.
          {{
          enum NameNodeFile
          enum NameNodeDirType
          public void setRestoreFailedStorage(boolean val) {
          public boolean getRestoreFailedStorage() {
          public List<StorageDirectory> getRemovedStorageDirs() {
          synchronized public void setStorageDirectories(Collection<URI> fsNameDirs, Collection<URI> fsEditsDirs)
          public static void checkSchemeConsistency(URI u) throws IOException {
          public Collection<URI> getImageDirectories() throws IOException {
          public Collection<URI> getEditsDirectories() throws IOException {
          public int getNumStorageDirs(NameNodeDirType dirType) {
          public Collection<URI> getDirectories(NameNodeDirType dirType)
          public long readCheckpointTime(StorageDirectory sd) throws IOException {
          public void writeCheckpointTime(StorageDirectory sd) throws IOException {
          public void incrementCheckpointTime() {
          public long getCheckpointTime() {
          public void setCheckpointTime(long newCpT) {
          public File[] getFsImageNameCheckpoint() {
          public File getFsImageName() {
          public File getFsTimeName() {
          public void format() throws IOException {
          private int newNamespaceID() {
          protected void moveCurrent(StorageDirectory sd)
          protected void moveLastCheckpoint(StorageDirectory sd)
          protected void getFields(Properties props, StorageDirectory sd) throws IOException {
          protected void setFields(Properties props, StorageDirectory sd) throws IOException {
          public void setImageDigest(MD5Hash digest) {
          public MD5Hash getImageDigest() {
          }}

          {{
          static public File getImageFile(StorageDirectory sd, NameNodeFile type) {
          }}
          is now
          {{
          static public File getStorageFile(StorageDirectory sd, NameNodeFile type) {
          }}

          {{
          public boolean isPreUpgradableLayout(StorageDirectory sd) throws IOException {
          }}
          is the same except for disablePreUpgradableLayoutCheck used by BackupImage.

          {{
          synchronized public void attemptRestoreRemovedStorage(boolean saveNamespace) {
          }}
          calls directoryAvailable instead of manually re-adding to the edit streams.

          {{
          public File getFsEditName() throws IOException {
          }}
          gives back the filename directly instead of going through FSEditLog as FSImage used to.

          {{
          void format(StorageDirectory sd) throws IOException {
          }}
          uses the new formatOccurred interface

          All DistributedUpgrade methods now use a class member, upgradeManager, instead of explicitly relying on FSNamesystem.

          • We already have dirIterator in Storage, can we just use that and get rid of iterable method and Iterable interface? Currently, it seems like we have multiple ways of iterating over StorageDirectories.

          Removed. However, I think iterable is prettier to use than dirIterator.

          {{
          for (StorageDirectory sd : storage.iterable(NameNodeDirType.IMAGE)) {
          }}
          is cleaner than.
          {{
          for (Iterator<StorageDirectory> it = storage.dirIterator(NameNodeDirType.IMAGE); it.hasNext();) {
            StorageDirectory sd = it.next();
          }
          }}
          Perhaps this could be cleaned up in another JIRA.
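          A minimal sketch of how such an iterable(...) helper can wrap an existing dirIterator()-style API so the enhanced for loop works (the class and directory names here are hypothetical, not the NNStorage code):

          ```java
          import java.util.Arrays;
          import java.util.Iterator;
          import java.util.List;

          public class IterableDemo {
              private final List<String> dirs = Arrays.asList("/dfs/name1", "/dfs/name2");

              // Existing style: expose a raw Iterator, forcing the verbose loop.
              Iterator<String> dirIterator() {
                  return dirs.iterator();
              }

              // Proposed style: wrap the iterator in an Iterable view.
              Iterable<String> iterable() {
                  return new Iterable<String>() {
                      public Iterator<String> iterator() {
                          return dirIterator();
                      }
                  };
              }

              public static void main(String[] args) {
                  IterableDemo storage = new IterableDemo();
                  StringBuilder sb = new StringBuilder();
                  for (String sd : storage.iterable()) { // for-each now compiles
                      sb.append(sd).append(';');
                  }
                  System.out.println(sb); // prints "/dfs/name1;/dfs/name2;"
              }
          }
          ```

          The wrapper adds no new iteration mechanism; it only adapts the existing iterator to the shape the for-each syntax requires.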

          • conf field is never read, do we need it? Removing it will save the changes to UpgradeUtilities and a few tests.

          I think it's good to have it passed into NNStorage anyhow, even if it isn't used yet. If anyone wants to put something configurable into NNStorage, which isn't far-fetched, either they'll have to pass in the Configuration or call a set<ConfigurableParam> method. The second way breaks the encapsulation of NNStorage, so I'd like to leave the passing in of Configuration to avoid this situation. However, I've removed the unused field.

          • Why is method close synchronized? It only calls unlockAll which is not synchronized, therefore we are not really preventing race in accessing StorageDirectories.

          Copy-paste error. I had copied close from FSImage. Removed the synchronized now.

          • Can we make checkpointTime private?

          I did think about this while doing the changes. However, there is already a setCheckpointTime which writes the checkpoint time to the storage directories at the same time. I can create a setCheckpointTimeNoWrite, but that's ugly. I haven't thought about it much, but I haven't come up with a pretty way of naming the method. (Changing setCheckpointTime to writeCheckpointTime is also out, since that already exists.)

          5. FSEditLog.java

          • The comment on errorStreams still talks about the boolean propagate which was passed to processIOError. Since we are not passing this variable to errorStreams, I guess we are propagating it in all cases. How are we handling the case when propagate is false?

          As FSImage no longer owns the StorageDirectories, propagation no longer exists. Before the change, depending on whether the error occurred during an edit log write or an image write, propagate would be set to let the image or edit log, respectively, know of the error. Since NNStorage now handles errored storage directories and the notification of both the image and edit log, this propagation is no longer needed.
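          The arrangement described above can be sketched as follows. This is a hedged illustration of the listener pattern only; the interface and class names are stand-ins, not the actual HDFS API:

          ```java
          import java.util.ArrayList;
          import java.util.List;

          // Sketch: the storage class owns the directories and notifies every
          // registered listener (image and edit log alike) of a failed directory,
          // so no propagate flag is needed.
          interface ErrorListener {
              void errorOccurred(String dir);
          }

          public class ListenerDemo {
              private final List<ErrorListener> listeners = new ArrayList<ErrorListener>();

              void registerListener(ErrorListener l) {
                  listeners.add(l);
              }

              // Called when an I/O error is detected in a directory: tell every
              // listener once, replacing the old cross-notification between
              // FSImage and FSEditLog.
              void errorDirectory(String dir) {
                  for (ErrorListener l : listeners) {
                      l.errorOccurred(dir);
                  }
              }

              public static void main(String[] args) {
                  ListenerDemo storage = new ListenerDemo();
                  final StringBuilder log = new StringBuilder();
                  storage.registerListener(new ErrorListener() {   // stand-in for FSImage
                      public void errorOccurred(String dir) { log.append("image:").append(dir).append(' '); }
                  });
                  storage.registerListener(new ErrorListener() {   // stand-in for FSEditLog
                      public void errorOccurred(String dir) { log.append("edits:").append(dir).append(' '); }
                  });
                  storage.errorDirectory("/dfs/name1");
                  System.out.println(log.toString().trim()); // prints "image:/dfs/name1 edits:/dfs/name1"
              }
          }
          ```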

          4. CheckpointSignature.java

          • indentation in the construction of IOException.

          7. FSNamesystem.java

          • Indentation in registerBackupNode and releaseBackupNode

          8. GetImageServlet.java

          • Indentation

          What is the standard? Is it codified anywhere? The indentation on these looks fine to me now. There were a few tabs hanging about, but it's all 2-space now.

          9. TestSaveNamespace.java

          • What is the use of FSImageAccessor? setStorage is being called at a couple of places, but this class is not being used otherwise.

          You should never have to set the Storage on the FSImage after it has been created. However, it is useful for testing. FSImageAccessor is here to allow you to do that without polluting the production code.

          Jitendra Nath Pandey added a comment -

          0. Please also comment on the purpose of StorageListener interface in this jira.
          As I understand from the patch, upgradeManager is set in NNStorage so that NNStorage has no other dependency on FSNamesystem. Is that correct? A comment about this would be helpful.

          1. Storage.java
          The deleteDir and rename methods are being accessed non-statically in FSImage.

          2. BackupStorage.java
          • BackupStorage is extending FSImage. We should rename it to something like BackupImage, so as not to confuse it with another kind of storage.
          • Indentation is incorrect for NNStorage.NameNodeFile.EDITS_NEW.getName(). It was like this even before this change, but we should fix it with this change.

          3. NNStorage.java

          • Most of the methods in NNStorage are those removed from FSImage. Is that correct? Please comment on the jira if there are any changes to the semantics for some methods moved from FSImage to NNStorage.
          • getImageFile is used to get files of other kinds as well. We could rename this method to something like getStorageFile, or just getFile, or anything that makes it look like a more generic method.
          • indentation of nested blocks in method iterable.
          • We already have dirIterator in Storage, can we just use that and get rid of iterable method and Iterable interface? Currently, it seems like we have multiple ways of iterating over StorageDirectories.
          • conf field is never read, do we need it? Removing it will save the changes to UpgradeUtilities and a few tests.
          • Why is method close synchronized? It only calls unlockAll which is not synchronized, therefore we are not really preventing race in accessing StorageDirectories.
          • In comments please also make a note about why the method is synchronized if applicable.
          • Can we make checkpointTime private?
          • Set InterfaceAudience to private.
          • Add comments for the class and javadoc for StorageListener interface methods.
          • javadoc for all public methods.
          • LOG.info in attemptRestoreRemovedStorage still refers to FSImage.

          4. CheckpointSignature.java

          • indentation in the construction of IOException.

          5. FSEditLog.java

          • The comment on errorStreams still talks about the boolean propagate which was passed to processIOError. Since we are not passing this variable to errorStreams, I guess we are propagating it in all cases. How are we handling the case when propagate is false?

          6. FSImage.java

          • In recoverTransitionRead method, storage.checkVersionUpgradable calls static method checkVersionUpgradable on object storage.
          • Several static methods in NNStorage and Storage are being accessed non-statically. We should preferably make them non-static or call them on the class.
          • Indentation in recoverTransitionRead, where IOException is constructed, and in formatOccurred method.

          7. FSNamesystem.java

          • Indentation in registerBackupNode and releaseBackupNode

          8. GetImageServlet.java

          • Indentation

          9. TestSaveNamespace.java

          • What is the use of FSImageAccessor? setStorage is being called at a couple of places, but this class is not being used otherwise.
          Ivan Kelly added a comment -

          kicking for hudson

          Ivan Kelly added a comment -

          updated patch to include dhruba's saveNamespace changes

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12466895/HDFS-1557-trunk.diff
          against trunk revision 1055142.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/82//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          cancelling patch and resubmitting to make hudson run on it

          Flavio Junqueira added a comment -

          Hi there, I was wondering if there is anyone reviewing this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12466886/HDFS-1557-trunk.diff
          against trunk revision 1052169.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestSaveNamespace
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/42//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/42//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/42//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          This patch is the same as the NNStorage.diff attached to HDFS-1489.


            People

            • Assignee:
              Ivan Kelly
            • Reporter:
              Ivan Kelly
            • Votes:
              0
            • Watchers:
              9
