Details

    • Hadoop Flags: Reviewed

      Description

      For HDFS-1073 and other future work, we'd like to have the concept of a transaction ID that is persisted on disk with the image/edits. We already have this concept in the NameNode but it resets to 0 on restart. We can also use this txid to replace the checkpointTime field, I believe.
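      As a rough illustration of the concept (a hedged sketch, not the patch itself - all names here are hypothetical), the txid would be a monotonic counter written out with the image/edits and read back on startup:

        // Hypothetical sketch: persist the transaction ID alongside the
        // image/edits so it survives a NameNode restart instead of
        // resetting to 0.
        class TxIdSketch {
          private long txid;                 // last transaction ID handed out

          long nextTxId() { return ++txid; } // incremented once per transaction

          void saveHeader(java.io.DataOutputStream out) throws java.io.IOException {
            out.writeLong(txid);             // persisted with the on-disk header
          }

          void loadHeader(java.io.DataInputStream in) throws java.io.IOException {
            txid = in.readLong();            // restored on load, so it could replace checkpointTime
          }
        }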

      Attachments

      1. hdfs-1521.txt
        30 kB
        Todd Lipcon
      2. hdfs-1521.txt
        31 kB
        Todd Lipcon
      3. hdfs-1521.txt
        56 kB
        Todd Lipcon
      4. HDFS-1521.diff
        34 kB
        Ivan Kelly
      5. HDFS-1521.diff
        55 kB
        Ivan Kelly
      6. HDFS-1521.diff
        56 kB
        Ivan Kelly
      7. HDFS-1521.diff
        57 kB
        Ivan Kelly
      8. HDFS-1521.diff
        55 kB
        Ivan Kelly
      9. hdfs-1521.5.txt
        34 kB
        Todd Lipcon
      10. hdfs-1521.4.txt
        45 kB
        Todd Lipcon
      11. hdfs-1521.3.txt
        42 kB
        Todd Lipcon
      12. FSImageFormat.patch
        5 kB
        Konstantin Shvachko

          Activity

          dhruba borthakur added a comment -

          can we instead use the generationStamp that the NN maintains persistently on disk?

          Todd Lipcon added a comment -

          Dhruba: I think overloading generationStamp is a little messy, because then we add another circular dependency between FSNamesystem and EditLog, which we're trying to eliminate. We already have a "txid" counter in FSEditLog, we just don't write it down anywhere.

          Todd Lipcon added a comment -

          This patch adds the transaction ID to the header of the edit log and image, and restores it when the NN loads a new one.

          Todd Lipcon added a comment -

          [exec]
          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 21 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system test framework. The patch passed system test framework compile.

          The "1 new Findbugs" is actually from the followup patch of HDFS-1473 which changed the class name of FSImageFormat$Writer to $Saver. Filed as HDFS-1532

          Todd Lipcon added a comment -

          Patch fell out of date. Rebased patch.

          Todd Lipcon added a comment -

          All unit tests except for known problems pass.

          Sanjay Radia added a comment -

          @Dhruba > can we instead use the generationStamp that the NN maintains persistently on disk?
          Isn't the txId different - it is incremented per transaction. The G# isn't.

          Sanjay Radia added a comment -

          Todd, from what I can see you are planning to write the txid in the header of the fsImage (makes sense) and the header of the edits.
          Wouldn't it make more sense to write a txid per edits record?

          dhruba borthakur added a comment -

          I am fine creating another persistent id here.

          > @Dhruba > can we instead use the generationStamp that the NN maintains persistently on disk?
          > Isn't the txId different - it is incremented per transaction. The G# isn't.

          @Sanjay: I was mentioning that we already have a sequentially increasing persistent id that we store on disk. This is the generation stamp. It is currently incremented only during file creations and error occurrences. We could decide to increment it for every transaction too for this JIRA.

          Todd Lipcon added a comment -

          The reasons I chose to use the txid in FSEditLog rather than co-opt the generation stamp were:

          • We already had txid in FSEditLog, so it seemed strange to increment this as well as the generation stamp
          • Incrementing FSN's generation stamp would add another cyclic dependency from FSEditLog back to the core of the NN, which we're trying to eliminate in other JIRAs.

          As for deciding to put the txid in the header rather than on every record, I could go either way. I went with just the header because doing it in every record adds 8 bytes per edit, which would probably be 10% or so extra space overhead (likely causing 10% extra time spent loading the image too). I didn't benchmark it, but without any particular benefit, it didn't seem like it was worth the penalty.

          One compromise might be to periodically add a "sync" record which includes the current transaction ID and perhaps some kind of magic number, kind of like what SequenceFile does. This would be handy for repair processes or even for running MR jobs on edit logs some day. Thoughts?
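
          A minimal sketch of what such a periodic sync record could look like (the interval, magic value, and class name here are illustrative assumptions, not part of any attached patch):

            import java.io.DataOutputStream;
            import java.io.IOException;

            class SyncingEditWriter {
              private static final int SYNC_INTERVAL = 1000;      // hypothetical period
              private static final long SYNC_MAGIC = 0x5EC0FD15L; // hypothetical magic number

              private long txid = 0;

              void writeEdit(DataOutputStream out, byte opCode, byte[] payload)
                  throws IOException {
                out.writeByte(opCode);
                out.write(payload);
                // Every SYNC_INTERVAL edits, emit a marker that a repair tool
                // (or an MR input format) could scan for to resynchronize
                // partway through the file.
                if (++txid % SYNC_INTERVAL == 0) {
                  out.writeLong(SYNC_MAGIC); // recognizable record boundary
                  out.writeLong(txid);       // txid reached at this point
                }
              }
            }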

          dhruba borthakur added a comment -

          > periodically add a "sync" record which includes the current transaction ID

          Sounds like a good idea - it especially helps while debugging a corrupt edits log. How often will this be inserted?

          Sanjay Radia added a comment -

          Adding the txid to each record has the advantage that if we ever get two transactions reversed it will be detected.

          dhruba borthakur added a comment -

          I would vote for adding the txid to each record. Also, since the image is compressed, the extra disk overhead should be minuscule, shouldn't it?

          Todd Lipcon added a comment -

          Seems the consensus is to put it on every record - happy to do that. Note that the records would go in the edits log which isn't compressed, so we should expect a full extra 8 bytes per record from this change.

          dhruba borthakur added a comment -

          > Note that the records would go in the edits log which isn't compressed

          That's true! I am still ok with putting the txid on every record.

          Todd Lipcon added a comment -

          This patch switches to logging a txid for every edit and verifying strict sequential ordering on load.

          I also left the txid in the header - it seemed to me this is advantageous just as something that must be there at the top of every edit file. If others disagree we can take it out.

          Added some basic tests as well to ensure we can still read the old format.
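
          A hedged sketch of the strict-ordering check on load (class, method, and field names are illustrative, not the patch's actual loader code):

            import java.io.DataInputStream;
            import java.io.IOException;

            class SequentialTxIdChecker {
              // Each replayed edit must carry a txid exactly one greater than
              // the previous one; any gap or reordering aborts the load.
              long loadEdits(DataInputStream in, long expectedStartingTxId, long numEdits)
                  throws IOException {
                long expected = expectedStartingTxId;
                for (long i = 0; i < numEdits; i++) {
                  long recordTxId = in.readLong();  // txid logged with every edit
                  if (recordTxId != expected) {
                    throw new IOException("Expected transaction ID " + expected
                        + " but got " + recordTxId);
                  }
                  // ... decode and apply the edit's opcode and payload ...
                  expected++;
                }
                return expected - 1;                // last txid applied
              }
            }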

          Todd Lipcon added a comment -

          Current patch that I uploaded yesterday passes test-patch and unit tests, though there are one or two pretty trivial TODOs left in the patch, so I need to upload a final one with those addressed. Before I do so, any review comments?

          Todd Lipcon added a comment -

          New version addresses the TODOs, removes a couple places I had put unrelated notes to myself, and adds an explicit initialization of BackupStorage#lastAppliedTxId (per review comment by Sanjay offline)

          Will rerun unit tests just to be safe.

          Todd Lipcon added a comment -

          Unit tests still pass (except for those known to fail)

          Todd Lipcon added a comment -

          test-patch also passes:

          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 21 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system test framework. The patch passed system test framework compile.

          Konstantin Shvachko added a comment -
          1. FSImage.loadFSEdits(StorageDirectory sd) should return boolean instead of FSEditLogLoader. The boolean means whether the edits should be saved or not. It should be calculated inside loadFSEdits.
          2. You can avoid introducing FSEditLogLoader.needResave by setting expectedStartingTxId before checking that logVersion != FSConstants.LAYOUT_VERSION. Then the old logic of counting this event as an extra transaction will work. I see lots of new members that are used in only a few places and can be avoided, which complicates the code.
          3. It would be good if you could replace the FSEditLogLoader.expectedStartingTxId member by the respective parameter to loadFSEdits() instead of passing it to the FSEditLogLoader constructor.
          4. I think after that you can also get rid of FSEditLogLoader.numEditsLoaded.
          5. Why don't we write the opCode first, then the txID, then the Writable? There will be fewer code changes on the loading part. You will still use OP_INVALID to determine the end of file. This eliminates a lot of changes, and the extra constant EOF_TXID for -1. (See the sketch after this comment.)
          6. Should we introduce a TransactionHeader at this point and write it as a Writable? Just something to consider; I did not evaluate the complexity of introducing it.
          7. Need to change the JavaDoc for EditLogOutputStream.write() - it is missing a parameter. Could you also add an explanation of the transaction header, which in your patch is just a comment in the loading code.
          8. I don't see any reason to have the txID at the beginning of every edits file. You will have it in the name, right? So it is redundant. I'd rather remove it. Duplication of information brings the burden of keeping the copies in sync. Also, you will not need to carry the parameters inside create() through all the calls.
          9. beginTransaction() instead of startTransaction(), as it matches endTransaction(). I mean start-stop, begin-end.
          10. Don't change rollEditLog() to return long. It is only used in the test. You can getLastWrittenTxId() from editsLog there instead.
          11. It looks to me like FSImage.checkpointTxId is simply currentTxId. If it is, that would be more intuitive. I also don't understand the JavaDoc comment for this member.
          12. BackupStorage.lastAppliedTxId - isn't it just checkpointTxId, which is already defined in the base FSImage?

          I hope this will simplify the patch.
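
          A small sketch of the record layout suggested in point 5 (DataOutputStream and org.apache.hadoop.io.Writable are the real types; the class name and constant are illustrative assumptions):

            import java.io.DataOutputStream;
            import java.io.IOException;
            import org.apache.hadoop.io.Writable;

            class EditRecordWriter {
              // Point 5's layout: opcode first (so OP_INVALID can still mark
              // end-of-file with no extra EOF_TXID sentinel), then the txid,
              // then the existing Writable payload unchanged.
              void writeRecord(DataOutputStream out, byte opCode, long txid,
                  Writable payload) throws IOException {
                out.writeByte(opCode);
                out.writeLong(txid);
                payload.write(out);
              }
            }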

          Todd Lipcon added a comment -

          Thanks for the thorough review. Here's a new patch.

          FSImage.loadFSEdits(StorageDirectory sd) should return boolean instead of FSEditLogLoader

          Fixed

          You can avoid introducing FSEditLogLoader.needResave by setting expectedStartingTxId before checking that logVersion != FSConstants.LAYOUT_VERSION. Then the old logic of counting this event as an extra transaction will work

          I found the former logic here to be very confusing and somewhat of a hack. It's also important that the loader returns the correct number of edits rather than potentially returning 1 when there are 0 edits. If it did that, it would break many cases by potentially causing a skip in transaction IDs. Though the new code adds a new member, the new member has a clear purpose and I think it's easier to understand from the caller's perspective, especially now that your point #1 above is addressed.

          It would be good if you could replace FSEditLogLoader.expectedStartingTxId member by the respective parameter to loadFSEdits

          I think after that you can also get rid of FSEditLogLoader.numEditsLoaded.

          Fixed

          Why don't we write the opCode first, then the txID, then the Writable? There will be fewer code changes on the loading part

          Very good call! This indeed cleaned up the loading code a lot.

          Should we introduce TransactionHeader at this point and write it as Writable. Just something to consider

          I think given that the header is still pretty simple it's not worth it at this point.

          Need to change JavaDoc for EditLogOutputStream.write(). Missing parameter

          Fixed

          I don't see any reason to have the txID at the beginning of every edits file. You will have it in the name, right

          beginTransaction() instead of startTransaction, as it matches with endTransaction()

          Fixed.

          Don't change rollEditLog() to return long. It is only used in the test

          It's necessary that the transaction ID be returned inside the same synchronization block. If we used a separate call to getLastWrittenTxId() then another txid could have been written in between (note that the test is multithreaded).
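
          As a sketch of the race being avoided (a hypothetical minimal class, not the actual FSEditLog):

            class EditLogSketch {
              private long txid;

              // Returning the txid from inside the synchronized method makes
              // "roll the log and observe its last txid" a single atomic step.
              synchronized long rollEditLog() {
                // ... close the current edits file, open edits.new ...
                return txid;
              }

              // A separate call like this can race: another thread may log a
              // transaction between rollEditLog() and getLastWrittenTxId().
              synchronized long getLastWrittenTxId() {
                return txid;
              }
            }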

          It looks to me that FSImage.checkpointTxId is simply currentTxId. If it is, it would be more intuitive

          It's not really current - it's the txid of the image file, not including any edits that have been written to the edit log - sort of like how checkpointTime is set only when an image is saved. Naming it "currentTxId" would imply that it is updated on every edit.

          BackupStorage.lastAppliedTxId - isn't it just checkpointTxId, which is already defined in the base FSImage?

          Contrary to above, lastAppliedTxId refers to the transaction ID that has been applied to the namespace. This is always >= checkpointTxId - checkpointTxId only changes when the BN saves an image, but lastAppliedTxId changes every time some edits are applied via RPC.

          I'll run the new patch through the unit test suite one more time.

          Todd Lipcon added a comment -

          Tests passed on hdfs-1521.5.txt except for known failures

          Konstantin Shvachko added a comment -

          2. I started thinking about why we need FSEditLogLoader.needResave. First of all, it is only one of the conditions that requires a resave. Second, it is relevant only if the edits are not empty; otherwise it is going to be saved based on the transaction count. So the name should be isOlderVersion instead of needResave, which led me to the idea that we don't need to read the edits to know it's an old version, as we already know that after reading the VERSION file, even before we loaded the fsimage.
          So we can remove FSEditLogLoader.needResave along with all the logic related to it from EditsLog, and move this logic into FSImage.loadFSImage(File):

            boolean loadFSImage(File curFile) throws IOException {
              // ..............
              boolean needToSave = this.layoutVersion != FSConstants.LAYOUT_VERSION;
              return needToSave;
            }
          

          It should be this.layoutVersion, not the loaded one, as described below.

          Then I noticed that we read layoutVersion and namespaceId from the fsimage file and replace the values obtained from the VERSION file with them. We shouldn't be doing that. This is not introduced in your patch; it has been there for a while. But we should have fixed it during refactoring. I should have looked at the TODO comment that I myself wrote there. I think we need to fix it now.
          So the idea is that we should never rely on the values for namespaceId and layoutVersion read from fsimage. We should read them for backward compatibility, but then completely ignore them. Therefore we will not need those 2 fields
          FSImageFormat.Loader.imgVersion
          FSImageFormat.Loader.imgNamespaceID
          and related getters, as we will not need to assign those values in FSImage.
          If you think it's a different issue I can file one. But needResave does not need to be introduced here based on the above.

          Finally, this gets me to the question of whether FSImage.checkpointTxId, as you describe it, should be written to fsimage or to the VERSION file. Feels like VERSION is the right place. For image directories it is the latest txId committed to this image. For edits-only directories it is the expectedStartingTxId - 1.

          10. Yes you can surround a call to rollEditLog() and getLastWrittenTxId() with any synchronized sections you need.

          11. I do not understand how lastAppliedTxId is initialized the first time when the BN starts up. Maybe the BN should not count txIds but rather receive startTxId from the NN.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12466415/hdfs-1521.5.txt
          against trunk revision 1051644.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
          org.apache.hadoop.hdfs.TestHDFSTrash

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/13//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/13//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/13//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          Has this feature been put on ice, or is it waiting for some other JIRA to go in?

          Todd Lipcon added a comment -

          Hey Ivan. Sorry, it's been "on ice" but mostly just due to other commitments getting in the way for me. I need to circle back on Konstantin's last pieces of feedback - will try to prioritize it for this weekend.

          Ivan Kelly added a comment -

          Updated patch to apply on trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471094/HDFS-1521.diff
          against trunk revision 1070025.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/162//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/162//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/162//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          This patch addresses all Konstantin's comments except 11. There is something strange going on with lastAppliedTxId that TestBackupNode isn't currently picking up. At line 177, change it to the following:

                //
                // Take a checkpoint
                //
                backup = startBackupNode(conf, op, 1);
                waitCheckpointDone(backup);
                
                for (int i = 0; i < 10; i++) {
                  writeFile(fileSys, new Path("file_" + i), replication);
                }
                
                backup.doCheckpoint();
                waitCheckpointDone(backup);

          This will trigger the test to fail. The normal run of the test doesn't exercise convergeJournalSpool, so usually you don't see this.

          So, now you'll see that if the BackupNode loads a checkpoint and then tries to journal something, the lastAppliedTxId + 1 will be 1 even though we've loaded in an image and editlog. The simple fix is to put

                lastAppliedTxId = getEditLog().getLastWrittenTxId();
          

          in loadCheckpoint(). This should be the end of the story.

          However, with this change, you get the error

          java.io.IOException: Expected transaction ID 10 but got 11

          A transaction is going missing. What's happening is, when doCheckpoint gets kicked off, the log is rolled and logJSpoolStart is called, which creates an edit with opcode OP_JSPOOL_START. This opcode is caught by EditLogBackupOutputStream and never transmitted to the backup node, so the transaction IDs on the primary and the backup get out of sync.

          So, the question here is, is there any harm in actually transferring these OP_JSPOOL_START transactions, or are they just excluded as a precaution?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471206/HDFS-1521.diff
          against trunk revision 1071023.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/168//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/168//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/168//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471206/HDFS-1521.diff
          against trunk revision 1072023.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport
          org.apache.hadoop.hdfs.TestFileAppend2

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/181//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/181//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/181//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          Solved the control op problem by simply not beginning a transaction if the opcode is greater than the JSPOOL_START opcode.

          I think this is ready for submission now, unless there are more comments.
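
          A hedged sketch of the guard described above (the constant's value and the class name are illustrative assumptions; whether the comparison is strict or inclusive is a detail of the actual patch):

            class ControlOpGuardSketch {
              // Illustrative value: in the real code JSPOOL_START is an
              // edit-log opcode constant.
              private static final byte OP_JSPOOL_START = 102;
              private long txid;

              // Control ops such as OP_JSPOOL_START are swallowed by
              // EditLogBackupOutputStream and never reach the BackupNode, so
              // they must not consume a txid or the two sides drift apart.
              void logEdit(byte opCode) {
                if (opCode < OP_JSPOOL_START) {
                  txid++;   // beginTransaction(): only namespace edits get a txid
                }
                // ... write the record to the output streams ...
              }
            }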

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12472564/HDFS-1521.diff
          against trunk revision 1076024.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/228//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/228//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/228//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          I took a swing through Ivan's latest patch and it looks good to me. I fixed up one javadoc error and added a new comment in one bit of code that needed some clarification. Aside from that, this patch is identical to Ivan's.

          +1 from me - Konstantin, would you mind taking one last look before we commit this?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12472761/hdfs-1521.txt
          against trunk revision 1076696.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/233//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/233//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/233//console

          This message is automatically generated.

          Konstantin Shvachko added a comment -

          Attaching a diff for FSImageFormat only. Let's make sure imgVersion in the image file and layoutVersion in VERSION file are the same and then use the latter. This lets us avoid passing imgVersion to several methods, and clarifies that imgVersion in the image is nothing but a redundant field.

          The rest looks good. +1

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473532/FSImageFormat.patch
          against trunk revision 1080836.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:

          -1 contrib tests. The patch failed contrib unit tests.

          -1 system test framework. The patch failed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/254//testReport/
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/254//console

          This message is automatically generated.

          Ivan Kelly added a comment -

          Merged shv's changes with the main patch. One minor change: storage is now passed to FSImageFormat directly, so that getImageVersion doesn't have to reach back through namesystem.getImage().getStorage() to get the version.
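
          For illustration, a hedged sketch of the wiring difference; the stub type below stands in for the real NNStorage, and all names are illustrative:

            // Stub standing in for NNStorage; only the version accessor matters here.
            interface VersionedStorage {
              int getLayoutVersion();
            }

            class LoaderWithInjectedStorage {
              private final VersionedStorage storage;

              LoaderWithInjectedStorage(VersionedStorage storage) {
                this.storage = storage; // handed in directly by the caller
              }

              int getImageVersion() {
                // no static dependency on namesystem.getImage().getStorage()
                return storage.getLayoutVersion();
              }
            }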

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473549/HDFS-1521.diff
          against trunk revision 1080836.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileAppend4
          org.apache.hadoop.hdfs.TestLargeBlock
          org.apache.hadoop.hdfs.TestWriteConfigurationToDFS

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/256//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/256//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/256//console

          This message is automatically generated.

          Konstantin Shvachko added a comment -

          Ivan, you are reaching back one way or another.
          namesystem.getImage().getStorage() points exactly to where the version is.
          I don't see any benefit in introducing another nnStorage variable. Do you?
          It just obscures which layout version you are trying to get, and leads
          people to think there could be some other NNStorage that differs from the
          image's. I would only do this if you could eliminate the reference to
          namesystem, but it doesn't look like that is possible.

          Ivan Kelly added a comment -

          I've submitted a version without the storage change.

          The difference in "reaching back" with my former version was that FSImageFormat did not have to know about the existence of the getImage() method on FSNamesystem or the getStorage() method on FSImage, so it broke a static code dependency. I don't feel very strongly about it either way, though, so I've submitted a patch without it.

          I've looked at breaking the dependency on FSNamesystem before, and it does look to be possible, but it would be a lot of work. Really, FSImageFormat should only need the FSDirectory reference.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473838/HDFS-1521.diff
          against trunk revision 1082263.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/262//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/262//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/262//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Hey Konstantin, are you +1 on this latest patch still?

          Konstantin Shvachko added a comment -

          +1. Looks good to me.
          Agreed, breaking the dependency on FSNamesystem should be the next stage.

          Todd Lipcon added a comment -

          Committed this to trunk. Thanks Ivan for the finishing touches and Konstantin for the several reviews.

          Todd Lipcon added a comment -

          For some reason this is causing TestBackupNode to fail - surprised we didn't see this in the Hudson QA run.

          Reverting and reopening to address.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #569 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/569/)
          Revert HDFS-1521 because of failing TestBackupNode

          Todd Lipcon added a comment -

          The error is:

          2011-03-21 14:07:49,247 ERROR namenode.TestBackupNode (TestBackupNode.java:testCheckpoint(190)) - Error in TestBackupNode:
          java.io.IOException: Expected transaction ID 40 but got 10
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:146)
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:96)
          at org.apache.hadoop.hdfs.server.namenode.BackupImage.convergeJournalSpool(BackupImage.java:355)
          at org.apache.hadoop.hdfs.server.namenode.Checkpointer.doCheckpoint(Checkpointer.java:260)
          at org.apache.hadoop.hdfs.server.namenode.BackupNode.doCheckpoint(BackupNode.java:279)
          at org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:186)
          at org.apache.hadoop.hdfs.server.namenode.TestBackupNode.testCheckpoint(TestBackupNode.java:106)
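
          For context, a minimal sketch of the kind of sequential-txid check that raises this error during replay; the names are illustrative, not the actual FSEditLogLoader source:

            import java.io.IOException;

            class TxIdChecker {
              private long expectedTxId;

              TxIdChecker(long startingTxId) {
                this.expectedTxId = startingTxId;
              }

              /** Called once per edit record; rejects out-of-sequence transactions. */
              void check(long txId) throws IOException {
                if (txId != expectedTxId) {
                  throw new IOException("Expected transaction ID " + expectedTxId
                      + " but got " + txId);
                }
                expectedTxId++;
              }
            }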

          Ivan Kelly added a comment -

          Something in HDFS-1763 is causing the new failure. Still tracking it down.

          Ivan Kelly added a comment -

          I think this BackupNode failure is the result of a preexisting problem with BackupNode. The problem is the timing with which OP_JSPOOL_START arrives at the BackupNode. The sequence is:

               NameNode                  BackupNode
           1.                            doCheckpoint
           2.  startCheckpoint
           3.  log(OP_JSPOOL_START)
           4.                            download images
           5.                            merge
           6.                            upload new image
           7.                            convergeJournalSpool
           8.  flush editlog buffers
           9.                            startJournalSpool

               ...                       ...

          10.                            doCheckpoint
          11.  startCheckpoint
          12.  log(OP_JSPOOL_START)
          13.                            download images
          14.                            merge
          15.                            upload new image
          16.                            convergeJournalSpool (EXCEPTION)

          Basically, OP_JSPOOL_START doesn't reach the BackupNode before the first checkpoint finishes, so when it does arrive, a spool is created that is then converged during the next checkpoint; but that spool contains all the transactions from the first checkpoint onwards.
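
          To make the mismatch concrete, a hypothetical walkthrough; the txids are assumptions chosen to match the failure above, not values taken from the test logs:

            class SpoolRaceExample {
              public static void main(String[] args) {
                long spoolStartTxId = 10; // spool opened late, still holds txns from checkpoint 1
                long imageLastTxId = 39;  // merged image from checkpoint 2 already covers these

                long expected = imageLastTxId + 1; // loader expects replay to start at 40
                long got = spoolStartTxId;         // but the spool begins at txid 10
                if (got != expected) {
                  System.out.println("Expected transaction ID " + expected
                      + " but got " + got);        // matches the reported error
                }
              }
            }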

          Todd Lipcon added a comment -

          Committed to the HDFS-1073 branch. We will address the BackupNode issues mentioned by Ivan later in this branch, before the merge into trunk.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #643 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/)


            People

            • Assignee:
              Todd Lipcon
            • Reporter:
              Todd Lipcon
            • Votes:
              0
            • Watchers:
              8
