Hadoop HDFS: HDFS-1073

Simpler model for Namenode's fs Image and edit Logs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note:
      The NameNode's storage layout for its name directories has been reorganized to be more robust. Each edit now has a unique transaction ID, and each file is associated with a transaction ID (for checkpoints) or a range of transaction IDs (for edit logs).

      Description

      The naming and handling of the NN's fsimage and edit logs can be significantly improved, resulting in simpler and more robust code.

      Attachments

      1. ASF.LICENSE.NOT.GRANTED--hdfs1073.pdf (88 kB, Todd Lipcon)
      2. hdfs1073.pdf (189 kB, Todd Lipcon)
      3. hdfs1073.pdf (159 kB, Todd Lipcon)
      4. hdfs1073.tex (27 kB, Todd Lipcon)
      5. hdfs-1073.txt (207 kB, Todd Lipcon)
      6. hdfs-1073-editloading-algos.txt (37 kB, Todd Lipcon)
      7. hdfs-1073-merge.patch (734 kB, Todd Lipcon)
      8. hdfs-1073-merge.patch (740 kB, Todd Lipcon)
      9. hdfs-1073-merge.patch (738 kB, Todd Lipcon)

        Issue Links

        1. Refactor edit log loading to a separate class from edit log writing (Sub-task, Closed, Todd Lipcon)
        2. Refactor storage management into separate classes than fsimage file reading/writing (Sub-task, Closed, Todd Lipcon)
        3. Remove intentionally corrupt 0.13 directory layout creation (Sub-task, Closed, Todd Lipcon)
        4. Persist transaction ID on disk between NN restarts (Sub-task, Resolved, Todd Lipcon)
        5. Refactor more startup and image loading code out of FSImage (Sub-task, Resolved, Todd Lipcon)
        6. Add code to detect valid length of an edits file (Sub-task, Resolved, Todd Lipcon)
        7. Add code to inspect a storage directory with txid-based filenames (Sub-task, Resolved, Todd Lipcon)
        8. Add code to list which edit logs are available on a remote NN (Sub-task, Resolved, Todd Lipcon)
        9. Refactor log rolling and filename management out of FSEditLog (Sub-task, Resolved, Todd Lipcon)
        10. Reduce need to rewrite fsimage on startup (Sub-task, Resolved, Todd Lipcon)
        11. Extend image checksumming to function with multiple fsimage files (Sub-task, Resolved, Todd Lipcon)
        12. Remove use of timestamps to identify checkpoints and logs (Sub-task, Resolved, Todd Lipcon)
        13. Add migration tests from old-format to new-format storage (Sub-task, Resolved, Unassigned)
        14. Add state management variables to FSEditLog (Sub-task, Resolved, Todd Lipcon)
        15. Add some convenience functions to iterate over edit log streams (Sub-task, Resolved, Todd Lipcon)
        16. Update HDFS-1073 branch to deal with OP_INVALID-filled preallocation (Sub-task, Resolved, Todd Lipcon)
        17. Change edit logs and images to be named based on txid (Sub-task, Resolved, Todd Lipcon)
        18. Add constants for LAYOUT_VERSIONs in edits log branch (Sub-task, Resolved, Todd Lipcon)
        19. Additional QA tasks for Edit Log branch (Sub-task, Resolved, Todd Lipcon)
        20. Remove references to StorageDirectory from JournalManager interface (Sub-task, Resolved, Ivan Kelly)
        21. TestDFSUpgrade failing in HDFS-1073 branch (Sub-task, Resolved, Todd Lipcon)
        22. HDFS-1073: Fix backupnode for new edits/image layout (Sub-task, Resolved, Todd Lipcon)
        23. 1073: Enable multiple checkpointers to run simultaneously (Sub-task, Resolved, Todd Lipcon)
        24. HDFS-1073: Cleanup in image transfer servlet (Sub-task, Resolved, Todd Lipcon)
        25. HDFS-1073: Test for 2NN downloading image is not running (Sub-task, Resolved, Todd Lipcon)
        26. HDFS-1073: Some refactoring of 2NN to easier share code with BN and CN (Sub-task, Resolved, Todd Lipcon)
        27. Remove vestiges of NNStorageListener (Sub-task, Resolved, Todd Lipcon)
        28. TestCheckpoint needs to clean up between cases (Sub-task, Resolved, Todd Lipcon)
        29. Fix race conditions when running two rapidly checkpointing 2NNs (Sub-task, Resolved, Todd Lipcon)
        30. Image transfer process misreports client side exceptions (Sub-task, Resolved, Todd Lipcon)
        31. HDFS-1073: Kill previous.checkpoint, lastcheckpoint.tmp directories (Sub-task, Resolved, Todd Lipcon)
        32. Clean up and test behavior under failed edit streams (Sub-task, Resolved, Aaron T. Myers)
        33. 1073: Remove checkpointTxId from VERSION file (Sub-task, Resolved, Todd Lipcon)
        34. 1073: remove/archive unneeded/old storage files (Sub-task, Resolved, Todd Lipcon)
        35. 1073: 2NN needs to handle case of reformatted NN better (Sub-task, Resolved, Todd Lipcon)
        36. 1073: Image inspector should return finalized logs before unfinalized logs (Sub-task, Resolved, Todd Lipcon)
        37. 1073: Improve TestNamespace and TestEditLog in 1073 branch (Sub-task, Resolved, Todd Lipcon)
        38. 1073: Improve upgrade tests from 0.22 (Sub-task, Resolved, Todd Lipcon)
        39. 1073: determine edit log validity by truly reading and validating transactions (Sub-task, Resolved, Todd Lipcon)
        40. 1073: address checkpoint upload when one of the storage dirs is failed (Sub-task, Resolved, Todd Lipcon)
        41. 1073: NN should not clear storage directory when restoring removed storage (Sub-task, Resolved, Todd Lipcon)
        42. 1073: create an escape hatch to ignore startup consistency problems (Sub-task, Resolved, Colin Patrick McCabe)
        43. 1073: finalize inprogress edit logs at startup (Sub-task, Resolved, Todd Lipcon)
        44. 1073: Move edits log archiving logic into FSEditLog/JournalManager (Sub-task, Resolved, Todd Lipcon)
        45. 1073: Handle case where an entirely empty log is left during NN crash (Sub-task, Resolved, Todd Lipcon)
        46. 1073: consider adding END_LOG_SEGMENT txn when finalizing inprogress logs at startup (Sub-task, Open, Unassigned)
        47. 1073: update remaining unit tests to new storage filenames (Sub-task, Resolved, Todd Lipcon)
        48. 1073: Add a flag to 2NN to format its checkpoint dirs on startup (Sub-task, Resolved, Todd Lipcon)
        49. 1073: Checkpoint interval should be based on txn count, not size (Sub-task, Resolved, Todd Lipcon)
        50. 1073: address remaining TODOs and pre-merge cleanup (Sub-task, Resolved, Todd Lipcon)
        51. 1073: fix regression of HDFS-1955 in branch (Sub-task, Resolved, Todd Lipcon)
        52. 1073: Fault injection for StorageDirectory failures during read/write of FSImage/Edits files (Sub-task, Open, Unassigned)
        53. 1073: Zero pad edits filename to make them lexically sortable (Sub-task, Resolved, Ivan Kelly)
        54. 1073: Move all journal stream management code into one place (Sub-task, Resolved, Ivan Kelly)
        55. 1073: fix CreateEditsLog test tool in branch (Sub-task, Resolved, Todd Lipcon)
        56. 1073: Reenable TestEditLog.testFailedOpen and fix exposed bug (Sub-task, Resolved, Todd Lipcon)
        57. 1073: clean up TestCheckpoint and remove TODOs (Sub-task, Resolved, Todd Lipcon)
        58. 1073: Address remaining TODOs (Sub-task, Resolved, Todd Lipcon)
        59. 1073: address findbugs/javadoc warnings (Sub-task, Resolved, Todd Lipcon)
        60. saveNamespace should not throw IOE when only one storage directory fails to write VERSION file (Sub-task, Open, Unassigned)
        61. Complete decoupling of failure states between edits and image dirs (Sub-task, Open, Unassigned)

          Activity

          Todd Lipcon added a comment -

          I agree. I put one proposal that I'm partial to here: https://issues.apache.org/jira/browse/HDFS-955?focusedCommentId=12832706&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12832706

          The "#3" solution in the above comment I think is (a) reasonably simple, and (b) can be very easily made to keep around some number of trailing image/edit files in case of corruption bugs.

          Is this the sort of thing you're investigating for this JIRA, Sanjay?

          Sanjay Radia added a comment -

          This Jira proposes a simpler design for managing the fsimage and edit logs.
          The edit logs and fsimage scheme in current Hadoop requires coordination and can lead to tricky bugs (HDFS-955).

          Proposed design

          1. All transactions have a transaction ID. A transaction ID is a number that starts at zero and is incremented with each transaction. Each journal record in the editsLog file carries its transaction ID.
          2. An fsimage file is identified by the transaction ID of the last checkpointed transaction in the file.
            • E.g. fsImage_<transactionIDofLastTransactionCheckpointed>
          3. An editsLog file is identified by the transaction ID of the first recorded transaction in the file.
            • E.g. fsEditlogs_<transactionIDofFirstTransaction>
          4. To start the name server (a minimal sketch follows this list),
            • Load the fsImage with the greatest transaction ID N. If no image exists, take N to be 0.
            • Process all transactions > N from the editsLogs: find the editsLog file that includes the transaction with ID N+1, then process all transactions >= N+1 from that and all subsequent editsLog files.
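
          A minimal sketch of this startup rule in Java, assuming files named fsimage_<txid> and edits_<txid> in a single storage directory (the names and helpers are illustrative assumptions, not actual HDFS code):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

public class TxidStartup {
    public static void main(String[] args) {
        File dir = new File(args[0]);
        long n = 0;                       // txid of the newest image; 0 if none
        List<Long> logStarts = new ArrayList<>();
        for (String name : Objects.requireNonNull(dir.list())) {
            if (name.startsWith("fsimage_")) {
                n = Math.max(n, Long.parseLong(name.substring("fsimage_".length())));
            } else if (name.startsWith("edits_")) {
                logStarts.add(Long.parseLong(name.substring("edits_".length())));
            }
        }
        Collections.sort(logStarts);
        System.out.println("load fsimage_" + n);
        // Replay from the last log that starts at or before txid N+1, then
        // every subsequent log, skipping transactions <= N.
        int first = 0;
        for (int i = 0; i < logStarts.size(); i++) {
            if (logStarts.get(i) <= n + 1) first = i;
        }
        for (int i = first; i < logStarts.size(); i++) {
            System.out.println("replay transactions >= " + (n + 1)
                    + " from edits_" + logStarts.get(i));
        }
    }
}
```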

          Salient points

          • This scheme does not require any synchronization between when fsImages are checkpointed and when editsLog files are split (although it is convenient if, when you checkpoint at transaction ID N, you also split your edits logs at N or slightly less).
          • This means that the NameNode and BackupNode can share images and edits without coordination. (This is very different from the current design.) For example, the primary NN can decide that it wants a checkpoint, split the editLogs, and ask the backup NN to do a checkpoint; the checkpoint operation can succeed or fail without worries. (Btw, if the split of the editLogs is recorded as the last transaction in the edit logs, then the backup NN will see that transaction come across and realize that this is a convenient time to checkpoint.)
          • The scheme does not require coordination between checkpoints of the fsImage itself! For example, while the backup NN is doing a checkpoint, the NN could be asked to do a saveImage by the admin.
          • Policies on how many edits and fsimages to keep are separable.
          Sanjay Radia added a comment -

          Todd, just looked at your idea. Great minds think alike. :-) (I am sure most databases use something similar, because it is a fairly obvious solution.)
          When the backup NN work was done last year, I had proposed roughly that scheme, i.e. numbering the files incrementally. Rob Chansler on our team proposed the improvement of using the transaction ID. It decouples the split (i.e. roll) of the edit log from the checkpoint of the image.

          At that time we didn't do it for lack of time. I think we should do this for the next release (i.e. 22). It will simplify the code significantly and also avoid tricky bugs.

          dhruba borthakur added a comment -

          Cool stuff. A couple of questions:

          • what are the pros and cons of numbering the files sequentially (fsimage_0, fsimage_1, etc.) vs. appending the last known transaction to the filename?
          • this is very different from what we currently have in trunk, and it is a heavyweight change. You mention that "the primary NN can decide that it wants a checkpoint and hence split the editLogs and ask the backup NN to do a checkpoint"... this is not something that happens in the regular course of action, right? If this is truly a rare case, can the backup node simply detect this scenario and re-sync its entire image from the primary when it occurs?
          Todd Lipcon added a comment -

          what are the pros-and-cons of numbering the files sequentially, fsimage_0, fsimage_1, etc vs appending the last known transaction into the filename?

          Interesting question. The pro I can think of for sequential numbering (0, 1, 2, ...) is that we can determine whether there is a "gap" in the edit logs without looking at file contents. For example, if we see edits_0, edits_1, edits_3, we know that this edits directory is corrupt since we are missing edits_2. Whereas with txn IDs we can only detect a gap by reading through the entirety of each file and counting transactions.

          The pro of txid numbering is that we can detect the case where some middle log got truncated. For example, if we have edits_0, edits_1000, and edits_2000, but edits_1000 only contains 500 edits, we can fail at that point.

          However, there's nothing stopping us from getting the benefits of both - we could either make the filenames something like edits_<idx>_<first txid>, or just make sure we store the first txid in the header of the edit log.
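
          As a toy illustration of the filename-only check that sequential numbering enables (a hypothetical helper, not HDFS code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class EditsGapCheck {
    /** True if indices 0..max are all present, i.e. no log file is missing. */
    static boolean isContiguous(Set<Long> indices) {
        long max = Collections.max(indices);
        for (long i = 0; i <= max; i++) {
            if (!indices.contains(i)) return false;   // e.g. edits_2 is missing
        }
        return true;
    }

    public static void main(String[] args) {
        // edits_0, edits_1, edits_3 -> corrupt directory, no file reads needed
        System.out.println(isContiguous(new HashSet<>(Arrays.asList(0L, 1L, 3L))));
    }
}
```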

          Sanjay mentioned "it decouples the split (ie roll) of the edit log and the checkpoint of the image" but I'm not sure what he meant by that. I think we can still achieve the same goal using indexed files, as long as each roll increments the index. So, if we roll three times but only succeed in checkpointing once, we'd see fsimage_0, edits_0, edits_1, edits_2, fsimage_2, edits_3 (where fsimage_0 and edits_0 through edits_2 may be GCed according to age-out policy).

          this is very different from what we currently got in the trunk. And this is a heavyweight change

          Agree this is a large change, however I think it will reduce the amount of complicated statemachine code, and we know there are several very tricky bugs in the trunk implementation. I think this simpler design will be easier to understand and thus harder to write bugs into. Plus, it has the nice property that even if there is a bug it will be very hard to write one that corrupts the data since old versions can be lazily deleted and are never modified after close.

          Todd Lipcon added a comment -

          I began working on this tonight and came upon another pro for the sequential numbering of the logs:

          There may be occasional times when we want to roll the edit log when we've not made any edits into it. For example, if the NN has started but no edits have been made yet, and the 2NN wants to start a checkpoint, it will ask the NN to roll the edits log. In the sequential design, this means it will roll from edits_N to edits_N+1 even if edits_N is empty - the 2NN then downloads the empty edits_N and everything works as expected. If we used txids, there would be no new filename to move to, and we'd have to add special logic to detect this case in various places.

          Suresh Srinivas added a comment -

          > The pro of txid numbering is that we can detect the case where some middle log got truncated. For example, if we have edits_0, edits_1000, and edits_2000, but edits_1000 only contains 500 edits, we can fail at that point

          We should generate txids serially. With that, it would be easy to determine the missing logs based on gaps in txids, even if we decide on naming the files without txids.

          > "it decouples the split (ie roll) of the edit log and the checkpoint of the image"
          Currently a new edits are rolled when backup node asks the namenode to create one. NN could have a policy where for every N number of transaction, it could choose to roll edits. Backup NN could choose to checkpoint fsimage with the relevant edits that have been rolled so far. This makes rolling of edits a local NN policy and decouples it from backup NN.

          > There may be occasional times when we want to roll the edit log when we've not made any edits into it...
          To avoid this kind of complications - if NN periodically rolls edits, checkpointing could be merging fsimage with all the rolled edits so far. In such a case we need distinction between edits that are finalized and edit that is being written to.

          Todd Lipcon added a comment -

          We should generate txid serially. With that it would be easy to determine the missing logs, based on gaps in txid, even if we decide on naming the files without txids.

          But we can only find the gaps by reading the entirety of the files, whereas it's nice to be able to see the gaps as an operator with a simple "ls".

          This makes rolling of edits a local NN policy and decouples it from backup NN.

          I don't see how this is any different with txid based naming or sequential naming. In either case we can have the backup/secondary/etc understand how to pull and checkpoint multiple files. The CheckpointSignature just needs to include the information about which logs have been completed.

          In such a case we need distinction between edits that are finalized and edit that is being written to.

          Yes, I came to this same conclusion this morning when thinking about how to safely restore failed storage directories.

          I should have a preliminary patch this week - so far I have the NN side generally working (loading, saveNamespace, restart, etc) but haven't started work on the various checkpoint mechanisms.

          Sanjay Radia added a comment -

          I forgot to mention other advantages.

          • The image does not need to be sent back to the primary NN via a special mechanism. One can simply copy it back using any tool.
          • If for HA one wants to keep the images and logs on shared storage (harder for logs), the checkpointer can simply copy the checkpointed image to the shared storage without involving the primary.
          • Rob Chansler pointed out another advantage: a server with large memory can simply run a checkpoint command as a cron job for ALL NNs in the data center (especially useful under federated NNs). The disadvantage is that it would require a separate server. Further, I believe it simplifies the backup NN code:
            • currently, when the backup starts a checkpoint, it has to lock the fs state, store the new logs sent by the primary to a special place, and then do something special to sync back in; I think this resync would not be necessary if we use a special server to run periodic checkpoints.
          Sanjay Radia added a comment -

          > Sanjay mentioned "it decouples the split (ie roll) of the edit log and the checkpoint of the image" but I'm not sure what he meant by that.
          >I think we can still achieve the same goal using indexed files, ... fsimage_0, edits_0, edits_1, edits_2, fsimage_2

          The serial numbering solution requires that checkpoints occur only at edits split boundaries.
          The transaction ID one does not have that restriction, but it does require that, in order to detect a gap in the edits, one has to look inside the logs. The txId one can avoid that if we are prepared to rename the edits log when we split (roll) it (Ugh!).
          The txId numbering scheme also has the advantage that multiple backups can roll and do checkpoints independently (we do NOT want to do that, as it will confuse the operators – but it shows that the design is very robust).

          The option to have any server do a checkpoint is useful. I think both schemes allow that.

          Robert Chansler added a comment -

          The name of a file is not the issue, but rather knowing what is in the file. Sanjay (in an early comment) stated the correctness rule in terms of TxIDs. Is there a proposal to substitute some other correctness rule? Another rule set seems more complex than what Sanjay proposes if it presupposes file-name rules and coordination about when particular files are created, written, and closed.

          Describing correctness in terms of TxIDs is attractive in that it is then possible to state retention rules for edits and image files conveniently. And it seems to facilitate the charming idea of having a utility service create checkpoints for any number of name servers without coordination/cooperation.

          Todd Lipcon added a comment -

          The serial numbering solution requires that checkpoints occur only at edits split boundaries.

          Yes, but since we can split edits at will, I don't think there's any problem with just having the backup node ask the active NN to roll whenever the BN would like to do a checkpoint. The nice thing about this is that an image file from the BN can be lined up exactly with the corresponding edit logs from the NN, etc.

          The transaction ID one does not have that restriction but it does require that in order to detect a gap in edits one has to look inside the logs. The txId one can avoid that if we are prepared to rename the edits log when you split (roll) it (Ugh!)

          Agreed re ugh! The renaming is the complexity we're trying to avoid, no?

          The txId numbering scheme also has the advantage that multiple backups can roll and do checkpoints independently (we do NOT want to do that, as it will confuse the operators – but it shows that the design is very robust).

          I still think this is possible with sequential numbering. And I agree that not confusing operators is a key design goal for this JIRA. The whole image/edit log thing should, in normal operation, be an implementation detail, and when operators have to look at it they're usually very stressed out because a cluster is corrupt - so we want to make it very clear what's going on, and very hard to create any state that is unrecoverable.

          I've started working on this patch and it's coming along nicely. The NN and secondary NN are working great, and just started on the BN/Checkpointer. Here's a brief overview of the design I'm going with - hopefully I will answer the above questions along the way.

          Storage contents

          The NN storage directories continue to be organized in the same way - either edits, images, or both. The difference is that each edits or fsimage file now has a suffix indicating its "roll index". For example, a newly formatted NN has the following contents:

          • fsimage_0 - empty image
          • edits_0_inprogress - the edit log currently being appended

          When edits are rolled, the current 'edits_N_inprogress' file is "finalized" by renaming to simply edits_N. So, if we roll the edits of the above image, we end up with:

          • fsimage_0 - same empty image
          • edits_0 - any edits made before the roll
          • edits_1_inprogress

          When an image is saved or uploaded via a checkpoint, the validity rule is as follows: any fsimage with roll index N must incorporate all edits from logs with a roll index less than N. So, if we enter safe mode and call saveNamespace on the above example, we end up with:

          • fsimage_0 - original empty image
          • edits_0 - edits before first roll
          • edits_1 - edits before saveNamespace
          • fsimage_2 - all edits from edits_0 and edits_1
          • edits_2_inprogress - the edit log where new edits will be appended

          Log Rolling Triggers

          The following events can trigger a log roll:

          • NN startup (see below)
          • saveNamespace
          • a secondary or backup node wants to begin a checkpoint
          • an IOException has occurred on one of the current edit logs
          • potentially we may find it useful to expose this as an admin function? (eg mysql offers a flush logs; command)

          Log rolling behavior (a minimal sketch follows this list):

          • The current edits_N_inprogress log is closed
          • The current edits_N_inprogress log is renamed to edits_N in all valid edits directories.
          • Any edits directories that previously had problems will be left with edits_N_inprogress (since we don't know whether all of the edits made it into that log before the roll; in fact, they probably did not)
          • The next edits_N+1_inprogress is opened in all directories, including an attempt to reopen any failed directories.
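
          A minimal sketch of this roll procedure, assuming the edits_<N>_inprogress naming above (class and method names are made up for illustration, not HDFS APIs):

```java
import java.io.File;
import java.io.IOException;

public class EditLogRoller {
    private final File dir;   // one edits directory; the real NN loops over all
    private long index;       // index of the current in-progress log

    EditLogRoller(File dir, long index) { this.dir = dir; this.index = index; }

    /** Finalize edits_<index>_inprogress, then open edits_<index+1>_inprogress. */
    void roll() throws IOException {
        File inProgress = new File(dir, "edits_" + index + "_inprogress");
        File finalized  = new File(dir, "edits_" + index);
        // Rename only in healthy directories; a failed directory keeps its
        // _inprogress file, to be dealt with by startup recovery.
        if (!inProgress.renameTo(finalized)) {
            throw new IOException("could not finalize " + inProgress);
        }
        index++;
        File next = new File(dir, "edits_" + index + "_inprogress");
        if (!next.createNewFile()) {
            throw new IOException("could not open " + next);
        }
    }
}
```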

          Startup behavior

          First we initiate log recovery:

          • Across all edits directories, look for any edits_N_inprogress:
            • If one is found, look for a finalized edits_N file in any other log directory
              • If there is at least one finalized edits_N, then the edits_N_inprogress is likely corrupt – rename it to edits_N_corrupt (or delete it if we are less cautious)
            • If there are no finalized edits_N files, then the NN crashed while we were writing log index N. Initiate recovery process across all edits_N_inprogress:
            • Currently this isn't fancy - I just pick one. However, we could scan each of the logs for OP_INVALID and find the longest one, ensure that they have the same length, etc. (eg one log might not have caught the last edit, or might have been truncated, etc)
              • This is very simple to do since across all directories (including secondaries) edits_M for any M should be identical!
              • After we've determined the correct log(s), finalize it and remove the others

          Next, find the fsimage_N with the highest N across all image directories.
          Then, find the edits_M with the highest M across all edits directories.

          For safety, we check that there exists an edits_X for all X between N and M inclusive.

          We then start up the NN by the following sequence (sketched in code below):

          • load fsimage_N
          • for each X from N through M inclusive, load edits_X
          • if we loaded any edits, save fsimage_M+1
          • open edits_M+1_inprogress
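
          Sketched in Java, with the safety check folded in (load/save helpers elided; again an illustrative assumption, not real NN code):

```java
import java.io.File;
import java.io.IOException;

public class NNStartup {
    /** n = highest image index found, m = highest edits index found. */
    static void start(File dir, long n, long m) throws IOException {
        for (long x = n; x <= m; x++) {              // safety: edits_X must exist
            if (!new File(dir, "edits_" + x).exists()) {
                throw new IOException("gap in edit logs: missing edits_" + x);
            }
        }
        loadImage(new File(dir, "fsimage_" + n));
        for (long x = n; x <= m; x++) {
            loadEdits(new File(dir, "edits_" + x));  // replay in index order
        }
        saveImage(new File(dir, "fsimage_" + (m + 1)));  // skip if no edits loaded
        openLog(new File(dir, "edits_" + (m + 1) + "_inprogress"));
    }

    static void loadImage(File f) { /* elided */ }
    static void loadEdits(File f) { /* elided */ }
    static void saveImage(File f) { /* elided */ }
    static void openLog(File f)  { /* elided */ }
}
```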

          Checkpoint process (outlined in code after this list)

          • Checkpoint Signature is modified to include the latest image index and the current log index in progress.
          • Checkpointing node issues beginCheckpoint to NN
          • NN rolls edit logs, and returns a checkpoint signature that includes the latest stored fsimage_N, as well as the index of the log it just rolled to
          • Image transfer servlet is augmented to allow the downloader to specify which image or edits file to download
          • Checkpointer downloads fsimage_N and edits_N through edits_M (where M is the new finalized edit log from the roll)
          • Checkpointer saves local fsimage_M+1, and uploads to NN
          • NN validation of the checkpoint signature is much simpler - just needs to make sure it came from the same filesystem, check any security tokens, etc. The old fstime and editstime constructs are no longer necessary since it's all encapsulated in the index numbers. For extra safety we can easily add some checksum or log length info to the CheckpointSignature
          • NN saves fsimage_M+1 into its local image dirs, but does not need to do any log manipulation.
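
          One checkpoint round, seen from the checkpointing node, might look roughly like this (every type and method here is a placeholder, not the actual servlet or RPC interface):

```java
public class CheckpointRound {
    /** Placeholder for the NN-side operations the checkpointer relies on. */
    interface NameNode {
        long[] beginCheckpoint();     // NN rolls; returns {imageIndexN, logIndexM}
        void download(String file);   // fetch via the image transfer servlet
        void uploadImage(String file);
    }

    static void runOnce(NameNode nn) {
        long[] sig = nn.beginCheckpoint();    // the checkpoint signature
        long n = sig[0], m = sig[1];
        nn.download("fsimage_" + n);
        for (long x = n; x <= m; x++) {
            nn.download("edits_" + x);
        }
        // Merge image N with edits N..M locally, producing fsimage_(M+1),
        // then hand it back; the NN does no log manipulation on upload.
        nn.uploadImage("fsimage_" + (m + 1));
    }
}
```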

          I'm still working out the backupnode operation, but I think it will actually be simplified by this proposal. Rather than having a special journaling mode, I think the NN can simply push any log roll events through the edit log stream to the BN. This will keep the roll indexes (and log contents) on the BN exactly identical to the indexes on the NN, which has good operational advantages and also reduces code complexity in the BN.

          Handling multiple checkpointers

          Note that in the above process there is no state stored on the NN with regard to ongoing checkpoint processes. If multiple checkpoint nodes checkpoint simultaneously, the NN will simply roll twice and hand a different index to each. Each will then upload fsimages with different indexes.

          Image/edits file retention policies

          There are a number of policies that should be simple to implement (the first is sketched below):

          • Number of saved images - ensure that we keep at least N saved images in our image directories; any that are more than N versions old can be deleted. Maintain edit logs that have index >= the index of the Nth-oldest image.
          • Time - ensure that we maintain all images within a trailing time window - again maintain all edit logs with index >= index of oldest maintained image.
          • Archival - for audit purposes, the deletion mechanism could very easily be augmented to archive the edit logs for later analysis (eg to HDFS, tape, SAN, etc)

          So long as any fsimage_N and all edits_M where M >= N are retained somewhere, they can be copied back into the NN's storage directories and full point-in-time recovery (PITR) is possible.
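
          A sketch of the first policy, keeping the N newest images and any edits file that a retained image might still need (index-based names assumed; deletion could just as well be archival):

```java
import java.io.File;
import java.util.Objects;
import java.util.TreeMap;

public class RetentionPolicy {
    static void retain(File dir, int imagesToKeep) {
        TreeMap<Long, File> images = new TreeMap<>();
        TreeMap<Long, File> edits = new TreeMap<>();
        for (File f : Objects.requireNonNull(dir.listFiles())) {
            String name = f.getName();
            if (name.matches("fsimage_\\d+")) {
                images.put(Long.parseLong(name.substring("fsimage_".length())), f);
            } else if (name.matches("edits_\\d+")) {
                edits.put(Long.parseLong(name.substring("edits_".length())), f);
            }
        }
        while (images.size() > imagesToKeep) {
            images.pollFirstEntry().getValue().delete();  // drop oldest image first
        }
        if (!images.isEmpty()) {
            // Keep every edits_M with M >= index of the oldest retained image.
            edits.headMap(images.firstKey()).values().forEach(File::delete);
        }
    }
}
```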

          Konstantin Shvachko added a comment -

          A question:
          I have one NN storage directory, which holds both image and edits. I have the image file fsimage_N in it and edits_N_inprogress. I also observed on the previous startup that the NN failed in the middle of writing fsimage_N. How do I recover from that?
          Another similar scenario is when a checkpointer starts to upload the new image to the NN but fails in the middle. How do you recognize that the latest image file is incomplete and that the previous version should be used instead?

          Todd Lipcon added a comment -

          Hi Konstantin,

          Thanks for the question. I'm maintaining the existing fsimage_ckpt_N naming for an upload in progress, so at startup that file is removed during recovery. The old image fsimage_M where M < N would still exist, so that would be used for recovery.

          The same's true for checkpoint uploads - if it fails in the middle it wouldn't be renamed yet, so we'd recover from an older fsimage.

          Sanjay Radia added a comment -

          Todd, thanks for the design. Before you move too far forward on the patch, I would like to get consensus on the two alternate designs.
          Further, I think we need to add some items to the design doc. (Given that we may go through two or three versions of the doc, would it be better to attach it rather than post it inline?)

          Please add the following items to your design doc:

          • BNN restarts - how does it sync up? What if we have multiple BNNs?
          • Checkpoint:
            • Concurrent checkpoints (saveImage and checkpointer)
            • Checkpoint done in the BNN which is also applying the edits stream to its state (does the notion of spooling in the current design change?)
            • Explore the notion of having checkpoints done offline - this is not targeted for the next release but something that we may want down the road; we need to evaluate the designs against this. (Of course we also need to evaluate whether or not offline checkpoints are a good idea in the first place.)
          • Managing edits and images in an HA environment. Here the idea is to move the image and edits to shared storage and treat the NN as "diskless". This is especially useful for federation when there are multiple NNs. Moving/writing the image to shared storage is not difficult, and it avoids the need to send the image back to the primary NN. Moving the edits to shared storage is hard because of the latency requirements. Here BookKeeper can come to the rescue; I don't see any other solutions so far.

          I am not proposing a very detailed design of the above items since we don't have the resources to do all that. However, as we evaluate the two alternate designs, let's use the above items to guide us.

          Sanjay Radia added a comment -

          > Checkpointing node issues beginCheckpoint to NN
          This is backwards (also in the original secondary NN). The NN, not the BNN, should decide when to do the checkpoint.
          Further, the NN should tell one of the BNNs that it is the checkpointer, and simply write a special "edits is rolled" transaction in the edit logs. This special record indicates to the BNN, if it is also playing the role of the checkpointer, that a checkpoint should be performed.
          This has a nice advantage: a rolled edits file ends with a special record indicating a clean roll or a clean shutdown.

          The above works for BOTH designs being proposed (serial # and txid).

          Konstantin Shvachko added a comment -

          I did not find it in the design above, so I asked.
          Now the next step is to write out a step-by-step procedure, say for saveNamespace(), and outline how you recover from failures at different stages. I mean that with the new approach you still need to rename fsimage_ckpt_N+1 to fsimage_N+1 and at the same time create a new edits_N+1_inprogress. And this is not atomic. We can start from a single directory, and then make sure it works in the multi- and split-directory cases.
          I am trying to understand how much simpler it gets.

          Sanjay Radia added a comment -

          >Please add the following items to your design doc:
One more item that Konstantin mentioned at lunch: upgrade - what, if anything, needs to change?

          >Concurrent checkpoints (saveImage and checkpointer)
          Oops, you have already covered it.

          Todd Lipcon added a comment -

          Hi Sanjay/Konstantin,

Thanks for the comments and questions. I didn't originally anticipate writing a design doc inline, but you know, fingers started typing and a few pages later it was a very long JIRA comment. I'll do another rev to address your questions as well as flesh out the BN and Upgrade bits, and upload it as an attachment here some time tomorrow.

          Mahadev konar added a comment -

I haven't been able to read through all the comments, so pardon me if my comments do not make much sense. Regarding using sequence numbers for naming edits and snapshots versus using transaction ids, I would like to put forth a few reasons it has been really useful for us in ZooKeeper to use transaction ids:

• Checks for missing transactions. With file names like edits_txid and snapshot_txid it's very easy to check whether any transaction is missing (see the sketch after this list).
• Debugging and finding the transaction you need is very easy. Let's say you want to dump the transaction logs starting from transaction X. With the above scheme it becomes very easy to find the right transaction log to start dumping from, with no files needing to be opened to check what transactions they might contain.
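To make the first point concrete, here is a hedged sketch (a hypothetical helper, not actual ZooKeeper or HDFS code) of how txid-carrying names such as edits_&lt;firstTxId&gt;-&lt;lastTxId&gt;, the range form discussed in this thread, let a tool verify contiguity from the directory listing alone:

```java
// Hypothetical sketch: with file names carrying txid ranges, a missing range
// of transactions is detectable without opening any file.
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EditLogGapChecker {
  private static final Pattern EDITS_NAME = Pattern.compile("edits_(\\d+)-(\\d+)");

  /** Returns the first missing txid, or -1 if the finalized logs are contiguous. */
  static long firstGap(List<String> fileNames) {
    TreeMap<Long, Long> ranges = new TreeMap<>(); // firstTxId -> lastTxId
    for (String name : fileNames) {
      Matcher m = EDITS_NAME.matcher(name);
      if (m.matches()) {
        ranges.put(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)));
      }
    }
    long expected = -1;
    for (Map.Entry<Long, Long> e : ranges.entrySet()) {
      if (expected != -1 && e.getKey() > expected) {
        return expected; // txids [expected, e.getKey()) are missing
      }
      expected = Math.max(expected, e.getValue() + 1);
    }
    return -1;
  }
}
```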

Hopefully this helps.

          Todd Lipcon added a comment -

          Here's a more fleshed out design document, including BN operation, upgrade, and some of the future direction design sketches. I tried to address the questions above - I'll take another pass through it tomorrow to see if I missed any of the comments.

          I'll also work on adding a section to compare and contrast the two alternatives that we're still debating - ie file names with transaction IDs vs file names with sequential numbering.

          Thanks for all the feedback!

          Sanjay Radia added a comment -

          Here is what I remember from our meeting in April. Todd, you took notes, please add anything I missed.
          There were 2 issues under contention:

          1. Add transaction Id to the edit logs
          2. Name the edit logs and image logs using the transaction id.

          These are orthogonal to each other.

Adding a transaction id to the edit logs has the following advantages (only the first advantage was discussed at the meeting; I am adding the other two):

• when a snapshot of a NN state is taken one can record the Tid for the snapshot - this is useful for knowing the diff between two snapshots etc.
• while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick the one with the most transactions.
• In order to do an offline fsck one needs to dump the block map; clearly one does not want to lock the system to do an atomic dump. The transaction id at which the dump started can be written in the dump to allow the fsck to report consistently.

The main disadvantage is that the edit logs will be a little bigger.

The main disadvantage of naming the edit logs using transaction ids is that the edit log reader needs to be able to seek forward to a specific transaction id. The advantages have been discussed above; I will summarize them in a separate comment.

          Mahadev konar added a comment -

Sanjay,
I don't understand the disadvantage you are quoting here. As far as I can see, being able to seek to a specific transaction quickly (which the snapshot and log with txnid enable you to do) is a good thing!

Is it the minimum set of code changes that is making you guys reject the txn-based snapshots and logging? As far as I read Todd's description, using transaction ids and naming the edit logs and image using the transaction ids enables all those recoveries stated in Todd's document!

          Todd Lipcon added a comment -

          Hey Sanjay,

          Thanks for reviving this. The notes you wrote above seem accurate.

          Couple of questions:

while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick the one with the most transactions.

Isn't this also doable by just seeing which has more non-zero bytes? ie seek to the end of the file, scan backwards through the 0 bytes, and stop. Whichever valid log is longer wins. Even in the case with the transaction-id, you have to do something like this for a few reasons: a) we'd rather scan backward from the end of the edit log than forward from the beginning, since it's going to be a faster startup, and b) even if we see a higher transaction id header on the last entry, that entry might have been incompletely written to the file, so we still have to verify that it deserializes correctly.
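For illustration, a minimal sketch of that backward scan, assuming an edits file whose preallocated tail is 0x00 bytes (this is not the actual FSEditLog code, and per point b above the record ending at the returned offset would still need to be deserialized and verified):

```java
// Minimal sketch: returns the offset just past the last non-zero byte.
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class EditLogLength {
  static long candidateValidLength(File edits) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(edits, "r")) {
      long pos = raf.length();
      byte[] buf = new byte[8192];
      while (pos > 0) {
        int toRead = (int) Math.min(buf.length, pos);
        raf.seek(pos - toRead);
        raf.readFully(buf, 0, toRead);
        for (int i = toRead - 1; i >= 0; i--) {
          if (buf[i] != 0) {
            return pos - toRead + i + 1; // last non-zero byte ends here
          }
        }
        pos -= toRead;
      }
      return 0; // entirely zeros: preallocated but never written
    }
  }
}
```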

The main disadvantage is that the edit logs will be a little bigger.

          So are you suggesting that each edit will include a header with the transaction ID in it? Isn't this redundant if the header of the whole edit file has the starting txid – ie is there ever a case where we'd skip a txid?

In order to do an offline fsck one needs to dump the block map; clearly one does not want to lock the system to do an atomic dump. The transaction id at which the dump started can be written in the dump to allow the fsck to report consistently.

          Sorry, can you elaborate a little bit here? In order to get a consistent dump of the block map don't we need to take the FSN lock and thus stall all operations? Is the idea that the BackupNode would do the blockmap dump offline since it can hold a lock for some time without stalling clients? If that's the case, what's the purpose of the offline nature of the fsck instead of just having BackupNode allow fsck to point directly at it and access memory under the same lock?

          Mahadev said:

Is it the minimum set of code changes that is making you guys reject the txn-based snapshots and logging?

I don't think either way has been decided/rejected yet. What you're saying has been my view - that the txid-based approach is a bigger change, since we have to introduce the txid concept and add extra code that allows replaying partial edit log files (ie a subrange of the edits within). But it's certainly doable and Sanjay has presented some good advantages.

          Ivan Kelly added a comment -

while writing edit logs to multiple files, a failure of the system can result in different amounts of data written to each file - the tid allows one to pick the one with the most transactions.

Isn't this also doable by just seeing which has more non-zero bytes? ie seek to the end of the file, scan backwards through the 0 bytes, and stop. Whichever valid log is longer wins. Even in the case with the transaction-id, you have to do something like this for a few reasons: a) we'd rather scan backward from the end of the edit log than forward from the beginning, since it's going to be a faster startup, and b) even if we see a higher transaction id header on the last entry, that entry might have been incompletely written to the file, so we still have to verify that it deserializes correctly.

The case of all edit logs being _inprogress during a crash should be very rare. Is it really an issue if it takes a little longer to determine which has the most transactions, if the cost is only incurred after a bad crash?

I don't think either way has been decided/rejected yet. What you're saying has been my view - that the txid-based approach is a bigger change, since we have to introduce the txid concept and add extra code that allows replaying partial edit log files (ie a subrange of the edits within). But it's certainly doable and Sanjay has presented some good advantages.

FSEditLog already has a transaction concept which could be modified for this. Currently it's not stored anywhere, but it is used for logSync. It starts at 0 at NN startup and increases monotonically, resetting the next time the NN starts.

          Ivan Kelly added a comment -

I've been working on Todd's code to bring it up to date with trunk. Currently I've got it as far as passing all the smoke tests.

          http://github.com/ivankelly/hadoop-hdfs/tree/hdfs-1073

          It would be good if we could get a consensus on which numbering approach to take, so I can attack that problem before getting the tests up to 100%.

          Sanjay Radia added a comment -

>> In order to do an offline fsck one needs to dump the block map; ...
          >Sorry, can you elaborate a little bit here? ...

This has nothing to do with the Backup NameNode. Currently fsck is implemented inside the NN. We would like to do this offline. So one could dump the block map and record the transaction id at the start of the dump. I believe that with this one would not have to lock the FSNamespace.

The above is just one use case of the transaction id. There are others. For example, during failover the transaction id would be useful for determining who has the latest edits.

          Sanjay Radia added a comment -

          > So are you suggesting that each edit will include a header with the transaction ID in it? ..
          Actually I was, but your suggestion works.
However, having a txid in each record helps with sanity checks and debugging. There have been cases in the past where we have gotten the transactions reversed.
The main cost is reading the extra 8 bytes and de-serializing the number.
(BTW, ZooKeeper does put the txid in each edit log record.)
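As an illustration of the per-record cost and the sanity check it buys, a hedged sketch (the names here are hypothetical, not the FSEditLog API):

```java
// Hedged sketch: each record is prefixed with its 8-byte txid, and the reader
// asserts strict ordering while replaying.
import java.io.DataInputStream;
import java.io.IOException;

class TxidCheckingReader {
  private long expectedTxid;

  TxidCheckingReader(long firstTxid) {
    this.expectedTxid = firstTxid;
  }

  /** Reads one record's txid prefix and verifies it is the next in sequence. */
  void checkNextRecord(DataInputStream in) throws IOException {
    long txid = in.readLong(); // the extra 8 bytes per record mentioned above
    if (txid != expectedTxid) {
      throw new IOException("Out-of-order edit: expected txid " + expectedTxid
          + " but read " + txid);
    }
    expectedTxid++;
    // ... deserialize the rest of the record here ...
  }
}
```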

          Todd Lipcon added a comment -

Here's a very preliminary patch for discussion, basically my work from this spring rebased on trunk with the generous help of Ivan K. This applies on top of HDFS-1462. It's not necessarily passing tests yet, and doesn't implement Sanjay's txid idea. It also needs plenty of cleanup.

          I intend to work on this over the next several days and post a better version, so no need to jump on this one and review unless you are particularly interested. I also pushed to my github at: http://github.com/toddlipcon/hadoop-hdfs/tree/hdfs-1073-october

          Todd Lipcon added a comment -

          I'm starting to think about how to convert the current code over to the txid based numbering, came upon a design point I wanted to discuss here:

          What should we do about the case when the edits should be rolled, but there have been no transactions since the last roll? For example, consider the following sequence:
          1) Start up a fresh NN. We are writing to file edits_0-inprogress
          2) Perform 100 edits- now current txid is 100.
          3) Perform a roll. This renames edits_0-inprogress to edits_0-100 and opens edits_100-inprogress
          4) No more edits, but a new BN starts up, and thus asks for another roll. Thus we would like to create edits_100-100, a file with no edits, which is a little bit strange, and will cause issues the next time we roll (we'll end up with edits_100-100 and also edits_100-200 for example)

          It seems the options are:
          a) if asked to roll when we have not written any transactions to our current log, it is a no-op
          b) whenever we roll, we append a special "trailer" transaction. Thus every log has at least 1 edit in it. I don't really like this, since it means that after a crash, we'll have a log without a trailer, which will add edge cases to worry about.

          I'm leaning towards A. Am I missing another good solution?
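A minimal sketch of option (a), assuming hypothetical field and method names rather than the real FSEditLog internals:

```java
// Sketch of option (a): a roll request is a no-op when the in-progress
// segment contains no transactions. All names here are illustrative.
class EditLogRoller {
  private long curSegmentFirstTxid; // first txid the in-progress segment can hold
  private long lastWrittenTxid;     // last txid actually logged

  synchronized void rollEditLog() {
    if (lastWrittenTxid < curSegmentFirstTxid) {
      return; // no edits since the last roll: avoid creating an empty edits_N-N
    }
    finalizeSegment(curSegmentFirstTxid, lastWrittenTxid); // -> edits_N-M
    curSegmentFirstTxid = lastWrittenTxid + 1;
    openSegment(curSegmentFirstTxid);                      // -> edits_N'-inprogress
  }

  private void finalizeSegment(long first, long last) { /* rename the file */ }
  private void openSegment(long first) { /* create the new in-progress file */ }
}
```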

          Todd Lipcon added a comment -

Converted HDFS-259 to a subtask. While we are cleaning up and redoing this section of the code, it will make things clearer to get rid of the code pertaining to ancient layout versions.

          dhruba borthakur added a comment -

          I prefer option (a).

          Suresh Srinivas added a comment -

          +1 for option A.

          Robert Chansler added a comment -

Is the file 100-100 forbidden? What if the service is stopped when the most recent file has zero records? (I'd always write an "I'm quitting" record; otherwise you can never know if you have lost the last edits.) And what if there are files 100-200 and 100-300? Rather than different special cases, why not make the general case just work? Roll means roll regardless, and starting up finds the latest image and any consistent sequence of edits that starts with the very next transaction, reporting whether the last available edit record is "I'm quitting!".

          And catching up with Sanjay's comment about tx ids in every record, it would seem that the principal benefits are really obtained only if the tx id is assigned to requests as they are received in sequence. Just doing log.write(id++) doesn't offer much real protection.

If there is a tx id per record, would it make sense for the actual bits to be the record checksum+id? Years ago we discussed having record checksums, but it never became a priority.

          (In file N-M, I might have expected that the first record, if any, has tx id N, not N+1.)

          Todd Lipcon added a comment -

          Hi Rob. You raise a good point. I think we'd have to do something where shutting down the NN with 100-inprogress would result in 100-100, and the NN would reopen that file as 100-inprogress upon restart. This seems messy to me - I would love to keep an invariant that once a file is "finalized" it is never renamed or changes contents.

          Sanjay Radia added a comment -

          > 4) No more edits, but a new BN starts up, and thus asks for another roll.
          A backup NN should not ask for a roll. The primary should roll when it feels it is necessary.

          Konstantin Shvachko added a comment -
• Sanjay, Todd means that when a checkpoint starts it triggers rollEdits(), which is the cutoff point for the new checkpoint. The checkpoint of course can use the latest rolled edits instead, but then you get the problem of synchronizing the start of a checkpoint and the edits roll event. Otherwise checkpoints may become way behind the current namespace state.
• I agree with Rob that edits_100-100 should not be a special case to avoid. In practice we will not see it, but if it happens the system should just absorb it. Todd correctly points out that if the system is idle for a very long time the NN may try to create edits_100-100 a second time, but this could be avoided simply based on the name collision.
• I do not see or did not understand the rationale for the "I'm quitting!" record. Why should the NN care whether the last record was lost or not? Just keep going with what it has. Worked so far.
          Konstantin Shvachko added a comment -

Todd, I briefly looked at the patch. It looks like you are trying to get rid of the Journal Spool in the BN. Correct me if I am wrong. I don't think you can. The BN makes a checkpoint from its memory state, which distinguishes it from the SNN and CN. While it does so, the namespace should be locked (for modifications), so the edits go into the journal spool, which is reapplied to memory after the checkpoint is finished. Please see the design doc in HADOOP-4539.

          Sanjay Radia added a comment -

          >... but then you get the problem of synchronizing the start of a checkpoint and the edits roll event. Otherwise checkpoints may become way behind the current namespace state.
I guess I am missing this.
We should avoid the synchronization that has been there in the original design of the secondary NN.
The BN can checkpoint whenever it feels that the set of rolled edits since the previous checkpoint is large enough. It may be simpler to do it on every roll if we have configured the NN to roll, say, every 10K transactions.
Perhaps what I am proposing works for the checkpointer but not for the BN because of some property of the BN that I am missing.

          Sanjay Radia added a comment -

>.. not understand the rationale for the "I'm quitting!" record. Why should the NN care whether the last record was lost or not? Just keep going with what it has.
The quitting record basically shows that the NN did a shutdown and did not die. This is useful to know. The NN will still continue as before.

If we were to add a similar "rolled" transaction at the end of every roll then we could avoid edits_100-100, since it would become edits_100-101.
Also the "rolled" transaction is a nice way to tell the BN that the primary did a roll without any special message from the NN to the BNN.

          Todd Lipcon added a comment -

          Hey all. Back in town after a few weeks in Japan, sorry for the relative absence.

I do not see or did not understand the rationale for the "I'm quitting!" record. Why should the NN care whether the last record was lost or not? Just keep going with what it has. Worked so far.

          I think one complication here is that we currently never have to re-open an edits file for append, since when we start, we always save a "fresh" checkpoint image and empty "edits" if there were any edits to apply. One advantage of the new design is that we no longer have to do this - we just bump the edits log number to the next one in sequence - ie we roll on startup if the latest edit log is non-empty.
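A hedged sketch of that startup behavior (all names hypothetical; the point is only that an existing log file is never reopened for append):

```java
// Illustrative only: on startup, finalize a non-empty in-progress segment and
// begin a fresh one, instead of reopening any existing file for append.
class StartupRoll {
  interface Segment {
    boolean hasTransactions();
    void finalizeUpTo(long lastTxid); // rename edits_N-inprogress to edits_N-M
  }

  /** Returns the first txid of the fresh segment to open. */
  static long rollOnStartup(Segment latestInProgress, long lastAppliedTxid) {
    if (latestInProgress != null && latestInProgress.hasTransactions()) {
      latestInProgress.finalizeUpTo(lastAppliedTxid);
    }
    return lastAppliedTxid + 1;
  }
}
```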

          Also the "rolled" transaction is a nice way to to tell the BN that the primary did a roll without any special message from NN to BNN

          The patch currently does exactly that - we just don't write down the special "roll" entry in any file streams. We certainly could, though, if it's useful to know that a file was completely written.

          Todd, I briefly looked at the patch. It looks like you are trying to get rid of the Journal Spool in BN. Correct me if I am wrong. I don't think you can

In the patch, the spooling has just become a bit more of a general case. Rather than spooling to a special file, we simply ask the primary NN to roll, and then wait for the roll to happen. While waiting for the roll, we continue to apply edits. Once we get the special "roll" record, we stop applying edits and make a checkpoint at that point. Once the checkpoint completes, we "converge" by continuing to read forward in the sequence of log files until we hit the end and are back "in sync".

          A backup NN should not ask for a roll. The primary should roll when it feels it is necessary.

          I think the simplest will be if anyone may ask for a roll - ie CN, BN, or NN. The NN of course is the one that actually makes the decision, but the decision may be in response to a request from one of the other nodes. I think this ability is useful not just for CN,BN, and NN, but also for example in backup scripts - you may ask the NN to roll right before making a tarball of the edits directory, and thus be sure that you get all of the current edits in "finalized" files.

          Robert Chansler added a comment -

          Worked so far.

How would you know? I just feel better having some check that the log is complete, especially in the new world where the log is a sequence of files. It's conceivable that not only could the last log file be truncated, but any number of log files at the end of the log could be missing entirely. Of course, if the log files were being written to a more robust file system like HDFS, the need for integrity checks would be less.

          Konstantin Shvachko added a comment -

Rob, it seems we cannot have the "I'm quitting!" record just because there is no "quit" or "shutdown" command. I agree the "rolled" transaction can be useful as a sanity check for the edits files that are not in-progress.

Todd, based on the design doc (which I should have read first thing) I don't see much difference between the current and your new implementation, except that you don't need a side file to write the edits to while spooling. Currently BN.startCheckpoint() causes NN.rollEdits(), which in turn sends back to the BN the SPOOL_START record. This is when the BN starts spooling. You seem to be calling the process of spooling (writing into the edits file but not applying to memory) "journal". That is what the state is called in your design, right? Which may be confusing, as the BN continues journaling (writing to the edits file) whether it is in synchronized or in spooling mode.
Also I don't see how you can get by with only 2 states for the BN; you need 3. While spooling there are 2 active threads: one (writer) is writing edits from the NN directly to edits_K, the other (reader) is reading formerly written records from edits_K. At the end we need to switch the writer thread from writing to applying the records to the in-memory state, and shut down the second thread. This is where you need the third state, currently called WAIT. When the reader thread reaches the end of the file it sets the WAIT state. The writer may still be writing before it sees the WAIT state. After seeing WAIT it blocks and waits until spooling is OFF. The reader reads the remaining records and turns the spooling OFF.
Let me summarize the meaning of the current journal spool states (see the sketch after the list):

          • JSpoolState.OFF - spooling is off, apply edits to memory state and write into journal (edits file)
          • JSpoolState.INPROGRESS - spooling in progress, do not apply to memory, just journal
          • JSpoolState.WAIT - stop, do nothing wait until spooling is OFF
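A compact sketch of that writer-side logic (illustrative only; it mirrors the description above, not the actual BN implementation):

```java
// Sketch of the three-state writer logic described above.
enum JSpoolState { OFF, INPROGRESS, WAIT }

class SpoolingWriter {
  private JSpoolState state = JSpoolState.OFF;

  synchronized void onEdit(byte[] record) throws InterruptedException {
    while (state == JSpoolState.WAIT) {
      wait(); // block until the reader drains the spool and sets OFF
    }
    journal(record); // always write to the edits file
    if (state == JSpoolState.OFF) {
      applyToNamespace(record); // normal mode: also apply to memory
    }
    // INPROGRESS: journal only; the reader thread will apply these later
  }

  synchronized void setState(JSpoolState s) {
    state = s;
    notifyAll();
  }

  private void journal(byte[] r) { /* append to edits_K */ }
  private void applyToNamespace(byte[] r) { /* apply to in-memory state */ }
}
```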

          Does that make sense?

          Sanjay Radia added a comment -

          > if anyone may ask for a roll - ie CN, BN, or NN. ...
          This is a little ambiguous. Let's clarify:

• Longer Term - we may need some time to get here since we want to minimize the changes to the BNN protocol.
            • An admin can ask for a roll.
            • NN does a roll when the size of the edits is big - say every 10K operations (a configurable parameter).
  • A NewCheckpointer which is given a set of fsimages and edits and creates a new checkpointed fsimage. The new fsimage is then copied OFFLINE to the NN, or wherever else we want it (say NFS, or HDFS). ie there is NO protocol between the NN and the NewCheckpointer. When this NewCheckpointer is available we can deprecate the old CheckpointNN (CN).
  • BNN does NOT ask for a roll - it simply observes the rolls via the "roll transaction". Does this work, or have I misunderstood the design of the BNN?
          • Shorter term (ie this Jira and release 22)
            • An admin can ask for a roll
            • NN does a roll when the size of the edits is big - say every 10K operations (a configurable parameter).
  • BNN can ask for a roll since this is already part of the protocol – btw if this can be avoided in this jira then good, but it may be too much change.
            • The CN (really a variation of the BNN) can ask for a roll.
          Sanjay Radia added a comment -

          On the "roll transaction" and the "quit transaction" we can add info such as # of files, dirs, blocks etc in the NN.
          This can be useful for sanity checking and testing.

          Todd Lipcon added a comment -

For anyone following along at home, I have most of the algorithms done for inspecting a set of storage dirs and deciding which sequence of images/logs to load. Attaching a patch which applies on top of HDFS-1521 and HDFS-1538 but isn't actually integrated into FSImage yet - I've been relying on true unit tests for this so far.

          Todd Lipcon added a comment -

          Progress is coming along on this issue. I've pushed a git branch for the work-in-progress here: https://github.com/toddlipcon/hadoop-hdfs/tree/hdfs-1073-march

          I've based this branch on top of HDFS-1521 and HDFS-1538

          Sanjay Radia added a comment -

          Noticed the progress on HDFS-1521.
          Todd, are you planning to add a subtask where the actual edits and fsimage files are named using the txId or will this be part of this Jira itself? Any intermediate patch on this part for review?

          Todd Lipcon added a comment -

          As discussed on the mailing list, I've created a branch for this JIRA and its subtasks. This will make intermediate review easier. Any interested parties, please watch the branch and the subtasks of this JIRA.

          stack added a comment -

Just to say that I've started following along on this issue, but it's kinda hard to figure out the plan from reading the above comments alone (I'm sure I missed a few of the switchbacks reading through). Any chance of the design doc getting updated to reflect what was agreed – it doesn't seem to match – and what's being pursued out on the branch? Thanks.

          stack added a comment -

          Oh, I'm asking because I'm trying to help out.

          Sharad Agarwal added a comment -

          I am out till April 11 and will get back to you on my return. Thanks

          Todd Lipcon added a comment -

          Updated most of the design doc to reflect the transaction-ID based naming. There are a few bits (eg test plan and backupnode operation) that still need to be updated to reflect the current plan, but this should be helpful to reviewers looking at the branch as it is today.

          Todd Lipcon added a comment -

          For those following the work on this branch: I will be on vacation tomorrow through 4/18, and back on this as a top priority starting 4/19. Thanks!

          Todd Lipcon added a comment -

          I'm back from vacation and just merged trunk into the development branch, so that the branch compiles again.

          Todd Lipcon added a comment -

          Status update: Merged federation into this branch. Next subtasks up and ready for commit are HDFS-1892, HDFS-1799, HDFS-1800, HDFS-1801.

          Eli Collins added a comment -

          Hey Todd,

          Here's my feedback on the design doc. Great writeup.

          • Putting txn IDs in the file names does feel like the right call
• Would be helpful to have a short problem statement section, eg why the new structure is less error prone, how enabling shared storage for metadata enables HA, etc.
          • Section 1.2, is the new OP_INVALID filler relevant here since a journal will also contain these txns in practice?
• Section 3.1, part 5: I think this is useful to expose as an admin function. It's good to decouple log rolling from saving the namespace from an administrator's perspective.
          • Section 4.1, one sentence defining log recovery would be helpful.
• Section 4.4, Step 5. I think it should load/apply edits_inprogress_Q, not just open it.
• Section 4.5. For your open question, I don't think we should support upgrade from a namespace that was not cleanly shut down. Ie let's restrict the space of logs an upgrade needs to deal with: the admin starts and cleanly shuts down the Namenode before upgrading, which seems reasonable to require, and should be the common case anyway.
          • Section 6, bullets 2 and 4, should we use CheckpointNode here and throughout the doc to be consistent?
          • Section 6, bullet 8, can remove, this is already done.
          • Section 7.1, link is broken.
          • Section 7.6, s/will be/will/
          • Section 8, clarify that N is the number of past images, there also needs to be N saved images if given N image directories.
• Section 8, bullet 3. Strongly agree, the rolling shouldn't be articulated in terms of file deletion, ie something generic like move, archive, or "trash" seems better.
          • Section 9.1. Agree the BN should trigger checkpoints via an op. Please add a note as to why that's better than the current approach.
          • Section 9.2. The CheckpointNode could be modified to use this tool, this will help make it more generally useful, eg for performing checkpointing for any number of namenodes, vs being a CheckpointNode for a given Namenode.

          Since the 2NN has been deprecated and replaced I think we can remove it in a future release (eg 23), should we file a jira for that?

          Thanks,
          Eli

          Todd Lipcon added a comment -

          Thanks for the comments. I'll address them and upload a new draft.

          Since the 2NN has been deprecated and replaced, I think we can remove it in a future release (eg 23); should we file a jira for that?

          Yes, I think so. In the code currently, the CN and the 2NN are essentially different implementations of the exact same thing. I can't think of any reason that an operator would want to run the old implementation. Removing the 2NN would also allow us to concentrate our testing on just one of the implementations (right now the CN isn't well covered by tests).

          Sanjay, do you agree?

          Todd Lipcon added a comment -

          Updated version of the design doc based on feedback from Eli and Sanjay.

          A few responses to your points:

          Section 1.2, is the new OP_INVALID filler relevant here since a journal will also contain these txns in practice?

          I don't think it's relevant to this document, since it has always existed. The recent change was just that we fill with OP_INVALID all the way to the end of the file, whereas before it just used to be a single OP_INVALID followed by 0x00s.
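          To make the filler scheme concrete, here is a minimal sketch of how a reader could bound the valid region of a preallocated edits file. The 0xFF value for OP_INVALID and the class and method names are assumptions for illustration; the real loader parses ops rather than scanning raw bytes, since 0xFF can also occur inside an op's payload.

          import java.io.IOException;
          import java.io.RandomAccessFile;

          /**
           * Illustrative sketch only: with the new scheme the preallocated tail
           * of an edits file is filled with the OP_INVALID byte, so a crude
           * upper bound on the valid length is the position just past the last
           * non-filler byte.
           */
          class EditsTailScan {
            private static final int OP_INVALID = 0xFF; // assumed filler byte value

            static long roughValidLength(RandomAccessFile f) throws IOException {
              for (long pos = f.length() - 1; pos >= 0; pos--) {
                f.seek(pos);
                if (f.read() != OP_INVALID) {
                  return pos + 1; // last non-filler byte ends the valid region
                }
              }
              return 0; // file is all filler
            }
          }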

          Section 4.5. For your open question, I don't think we should support upgrade from a namespace that was not cleanly shut down. Ie let's restrict the space of logs an upgrade needs to deal with: the admin starts and cleanly shuts down the Namenode before upgrading, which seems reasonable to require and should be the common case anyway.

          Actually, in the way this branch has progressed, we're maintaining today's behavior of supporting unclean upgrade just fine. We should of course QA it, but it didn't turn out to be particularly complex after refactoring the startup process.

          Todd Lipcon added a comment -

          For those who want a diffable view of the evolution of the document, I pushed my repository here:
          https://github.com/toddlipcon/hdfs-1073-design/commits/master

          Eli Collins added a comment -

          How about uploading the tex file to jira? That would make it easier for others to diff drafts and make edits.

          Todd Lipcon added a comment -

          Sure, here's the latex source.

          Sanjay Radia added a comment -

          Todd, very good document; the effort you have put in clearly shows.
          It will serve as the design doc for this very critical part of the NN.
          I have a few minor suggestions to improve the document.

          • Add motivation section (this jira has most of the stuff you need). I would include the following
            • Decouple the image and edits file naming to reduce code complexity when a new checkpoint is added
            • Allow for secondary and backup NN to generate a checkpoint without coordinating with NN.
            • Allow for NN to trigger the checkpoint rather than the secondary/backup
            • Allow one to implement an offline checkpointer
          • Mention somewhere that we need to add an NN shutdown command. Show that your design can accommodate it. I would prefer that
            in the case of shutdown, no edits log file is created and hence we could have the option of not worrying about changes in edits opcodes during upgrade (see the discussion on HDFS-1822)
          • Section 4.4 - "group" you really mean "the edits across all dirs" - clarify.

          4.5 - "Open question - upgrade when there wasn't a clean shutdown" – INMHO No we do not need to support it. I prefer
          that there should be a clean shutdown (see my comment about) - not just save namespace.

          I will comment separately on the test cases.

          Konstantin Shvachko added a comment -

          I would also like to ask for some benchmarks to make sure we do not lose performance for NN operations.
          NNThroughputBenchmark is applicable in this case, but other tests are welcome as well.

          Todd Lipcon added a comment -

          Konstantin: good idea about running NNThroughputBenchmark. Do you have a preferred configuration you can suggest (ie in terms of number of threads, etc.)? Initial results indicate there's a few percent slowdown for operations which sync edit logs, because the edit log entries are now each 8 bytes longer given they include a transaction ID. Read operations seem unaffected, given the changes don't touch those code paths.
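          To make the overhead concrete, the extra 8 bytes come from a long transaction ID written with every record. A minimal sketch of that record shape follows; the field order and names are illustrative, not the branch's exact wire format.

          import java.io.DataOutputStream;
          import java.io.IOException;

          /**
           * Illustrative record shape: an edit is an opcode, a txid, and an
           * op-specific payload. The writeLong() is the 8 bytes per entry
           * mentioned above.
           */
          class EditRecordSketch {
            static void writeEdit(DataOutputStream out, byte opCode, long txid,
                byte[] payload) throws IOException {
              out.writeByte(opCode); // 1 byte: which operation
              out.writeLong(txid);   // 8 bytes: the new per-edit transaction ID
              out.write(payload);    // op-specific body
            }
          }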

          Todd Lipcon added a comment -

          Hairong asked me to comment describing what testing I've done on the branch. Here's a summary:

          • Lots of new unit tests – about 3600 lines of net new test code, ~1000 lines updated. Total of 56 new test cases by my grepping.
          • Stress testing of 2NNs:
            • Case 1: Start one NN with two data dirs. Start two 2NNs configured with checkpoint period of 0 (checkpoint as fast as possible). Let it loop for several hours to make sure nothing crashes.
            • Case 2: Start one NN with two data dirs, one of which is on a filesystem mounted on top of software RAID configured in "faulty" mode. Set the "faulty" RAID driver to throw an IO error every 10,000 reads. Start a 2NN with checkpoint period 0, run for several minutes, and make sure the injected IO errors are handled correctly. Eventually the ext3 filesystem ends up remounting itself as read-only; fsck and remount the filesystem while the NN is running, and make sure it can be restored correctly.
            • Both of the above tests are run while a separate program with 10 threads pounds "mkdirs" and "delete" calls into the NN as fast as it can (a minimal sketch of such a load generator appears below).
          • Stress testing of BN:
            • Start NN. Start load generator (spamming mkdirs and delete calls).
            • Start BN with checkpoint configured once a minute.
            • Periodically stop load generator, issue mkdirs on NN and BN and make sure results are identical.
            • Take md5sum of files in BN's name dir and NN's name dir - verify that the MD5s match.
            • Resume load generation.

          The above testing yielded a couple of bugs which I then converted to functional tests to prevent regressions.
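          For reference, a minimal sketch of the kind of load generator used above; the class name, paths, and the absence of shutdown handling are illustrative simplifications, not the actual test harness.

          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          /** Ten threads loop issuing mkdirs and delete against the NN as fast as possible. */
          public class MkdirDeleteLoadGenerator {
            public static void main(String[] args) throws Exception {
              final FileSystem fs = FileSystem.get(new Configuration());
              ExecutorService pool = Executors.newFixedThreadPool(10);
              for (int t = 0; t < 10; t++) {
                final Path dir = new Path("/load-test/thread-" + t);
                pool.execute(new Runnable() {
                  public void run() {
                    try {
                      while (!Thread.currentThread().isInterrupted()) {
                        fs.mkdirs(dir);       // namespace write: logs one edit
                        fs.delete(dir, true); // namespace write: logs one edit
                      }
                    } catch (Exception e) {
                      throw new RuntimeException(e);
                    }
                  }
                });
              }
            }
          }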

          Jitendra Nath Pandey added a comment -

          A few comments:
          1. EditLogFileInputStream doesn't have any change except for an unused import.
          2. EditLogOutputStream.java: abstract void write(byte[] data, int i, int length)
             All transactions should have a txid, therefore this write method is confusing. I guess it would be cleaned up with the backup node fix. Please change the parameter name 'i' to offset.
          3. FSEditLog.java: What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT?
          4. FSEditLogOp.java: LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?
          5. NNStorage.java: writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state. It is not clear to me why we need to persist a txid marker separately.
          6. There are unused imports in a few files.
          7. I have a few concerns related to FSImageTransactionalStorageInspector and FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend committing HDFS-2018 in the branch as it significantly improves some parts of the code.
          Todd Lipcon added a comment -

          EditLogFileInputStream doesn't have any change except for an unused import.

          good catch, fixed the import

          EditLogOutputStream.java: abstract void write(byte[] data, int i, int length)
          All transactions should have a txid, therefore this write method is confusing.

          Agreed. This is used by the BackupNode which currently receives only byte arrays which have to be journaled, rather than logical transaction records. I added a javadoc which explains its purpose, and renamed the offset parameter.
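          The resulting shape is roughly the following; the javadoc wording here is a paraphrase for illustration, not the committed text.

          import java.io.IOException;

          abstract class EditLogOutputStreamSketch {
            /**
             * Write an already-serialized run of journal records into the stream.
             * Used by the BackupNode path, which receives opaque byte arrays to
             * journal rather than logical transaction records.
             */
            abstract void write(byte[] data, int offset, int length) throws IOException;
          }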

          What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT?

          I remember discussing this at one point on JIRA, but I can't seem to find the comment. I think it was either Sanjay or Rob Chansler who suggested that we later extend these opcodes to carry a bit of extra information such as the timestamp, the hostname, the namespace ID, etc. They would serve as extra sanity checks and possibly be useful for debug/audit/etc.

          Of course right now they don't do a whole lot, but I think they are still useful during "forensics" – eg when I'm looking at a log file in a hex editor, it would be nice to see one of these transactions at the end to know that it didn't somehow get truncated. Race condition bugs around rolling, like we've seen before, would also be a lot more obvious.
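          As a sketch of the bracketing under discussion: a segment missing its end marker was not cleanly closed. The opcode names come from the branch, but the numeric values and this tiny sink API are assumptions.

          /** Illustrative only: every segment is opened and finalized with a marker op. */
          class SegmentBracketingSketch {
            interface OpSink { void logOp(int opCode, long txid); }

            static final int OP_END_LOG_SEGMENT = 23;   // assumed value
            static final int OP_START_LOG_SEGMENT = 24; // assumed value

            private final OpSink out;

            SegmentBracketingSketch(OpSink out) { this.out = out; }

            void startLogSegment(long firstTxId) {
              out.logOp(OP_START_LOG_SEGMENT, firstTxId); // sanity marker at open
            }

            void endLogSegment(long lastTxId) {
              out.logOp(OP_END_LOG_SEGMENT, lastTxId); // evidence of a clean finalize
            }
          }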

          LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?

          Agreed - Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?

          writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state. It is not clear to me why we need to persist a txid marker separately.

          This was added in HDFS-1801, with the rationale in this comment. Basically it adds an extra safeguard so that if the last edit logs are somehow lost (or unavailable at startup), the storage directories will have enough info to detect it and prevent the NN from starting.
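          To illustrate the safeguard, here is a rough sketch of writing and checking such a marker per storage directory. The file name and the simplified write are assumptions for the sketch; the branch writes the marker atomically (cf. AtomicFileOutputStream).

          import java.io.File;
          import java.io.IOException;
          import java.nio.charset.StandardCharsets;
          import java.nio.file.Files;

          class SeenTxidSketch {
            /** Record the highest txid this NN has written into a storage dir. */
            static void writeSeenTxid(File storageDir, long txid) throws IOException {
              File marker = new File(storageDir, "seen_txid"); // assumed file name
              Files.write(marker.toPath(),
                  Long.toString(txid).getBytes(StandardCharsets.UTF_8));
            }

            /** Refuse to start if the recoverable logs fall short of the marker. */
            static void checkOnStartup(File storageDir, long maxLoadableTxid)
                throws IOException {
              File marker = new File(storageDir, "seen_txid");
              long seen = Long.parseLong(new String(
                  Files.readAllBytes(marker.toPath()), StandardCharsets.UTF_8).trim());
              if (maxLoadableTxid < seen) {
                throw new IOException("Edit logs end at txid " + maxLoadableTxid
                    + " but txid " + seen + " was previously seen; refusing to start");
              }
            }
          }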

          There are unused imports in a few files.

          Yep, thanks. Attached patch fixes most of them.

          I have a few concerns related to FSImageTransactionalStorageInspector, FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend to commit HDFS-2018 in the branch as it significantly improves some parts of the code.

          Let's continue to discuss there.

          I addressed the unused imports and javadoc fixes on the branch in r1146889.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #9 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/9/)
          Small javadoc and unused imports cleanup in response to Jitendra's review

          See https://issues.apache.org/jira/browse/HDFS-1073?focusedCommentId=13064221&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13064221
          Merge trunk into HDFS-1073
          HDFS-2135. Fix regression of HDFS-1955 in HDFS-1073 branch. Contributed by Todd Lipcon.
          HDFS-2133. Address remaining TODOs and pre-merge cleanup on HDFS-1073 branch. Contributed by Todd Lipcon.
          Amend HDFS-2011 for HDFS-1073 branch. Update test cases for new behavior of EditLogFileOutputStream. Contributed by Todd Lipcon and Eli Collins.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146889
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogOutputStream.java

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146881
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Host2NodesMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSShell.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestHost2NodesMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/hdfsproxy
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteRead.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/UpgradeObjectDatanode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/Host2NodesMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/secondary
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
          • /hadoop/common/branches/HDFS-1073/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/DecommissionManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/c++/libhdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDatanodeRegister.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/IncorrectVersionException.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/cli/testHDFSConf.xml

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146864
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146856
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/CheckpointCommand.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamenodeRegistration.java

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146848
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          Ivan Kelly added a comment -

          LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?
          Agreed - Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?

          HDFS-2149 will probably remove LogHeader completely. I plan to add a getVersion() call to InputStreams, and each stream will handle its own metadata internally. So EditLogFileInputStream will read its version on creation, or on the first call to read, etc. The input and output streams will be packet based, so an input stream is basically an iterator over FSEditLogOp objects and an output stream is a sink for FSEditLogOp objects. I think the way I've implemented the FSEditLogOp objects should avoid all extra copies and object creation. What's more, there's plenty of room to improve this by removing the creation of ArrayWritables and DeprecatedUTF8 objects and just writing strings and arrays directly.
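          A rough interface-level sketch of that shape, with all names assumed for illustration rather than taken from the HDFS-2149 patch:

          import java.io.Closeable;
          import java.io.IOException;

          interface EditLogOpStreams {
            /** Input side: an iterator over ops; the stream owns its header/version. */
            interface OpInputStream extends Closeable {
              int getVersion() throws IOException;
              Op readOp() throws IOException; // null once the valid log is exhausted
            }

            /** Output side: a sink for ops. */
            interface OpOutputStream extends Closeable {
              void writeOp(Op op) throws IOException;
            }

            /** Stand-in for FSEditLogOp: opcode, txid, and op-specific body. */
            final class Op {
              final byte opCode; final long txid; final byte[] body;
              Op(byte opCode, long txid, byte[] body) {
                this.opCode = opCode; this.txid = txid; this.body = body;
              }
            }
          }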

          Jitendra Nath Pandey added a comment -

          > Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?
          Sounds good.

          A few more minor comments:
          TestEditLog.java#testSimpleEditLog: Exception in the cluster.shutdown is being ignored.
          TestEditLog.java#testFailedOpen is disabled.

          TestBackupNode.java : waitCheckpointDone does nothing.

          Commented out code in a few places with TODOs.

          Todd Lipcon added a comment -

          good catches in TestEditLog.

          Are you sure you're looking at the latest version of the branch, regarding TestBackupNode? That function was filled in by HDFS-1979, which I believe I committed yesterday morning.

          I'll do another sweep for TODOs I might have missed.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #14 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/14/)
          HDFS-2160. Fix CreateEditsLog test tool in HDFS-1073 branch. Contributed by Todd Lipcon.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148070
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/bin/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
          Todd Lipcon added a comment -

          Hi Jitendra. I've addressed your feedback above:

          TestEditLog.java#testSimpleEditLog: Exception in the cluster.shutdown is being ignored.

          Committed a trivial fix in r1148480

          TestEditLog.java#testFailedOpen is disabled.

          Addressed by HDFS-2168

          Commented out code in a few places with TODOs.

          Fixed in HDFS-2170, HDFS-2169, and HDFS-2160

          Please let me know if you have more feedback. I agree that 2018 will continue to improve the code, but as discussed on the list I think we should merge this, and then take care of 2018 and 1580, and do a second merge.

          Todd Lipcon added a comment -

          Here's a candidate merge patch, including HDFS-2168, HDFS-2169, HDFS-2170, HDFS-2172 which are currently out for review.

          Just wanted to get the patch submission train going.

          This will generate one RAT warning due to CHANGES.HDFS-1073.txt. What's the best way to integrate the changelist into CHANGES.txt? Should I dump the entire list in, or just a single entry for HDFS-1073? Or perhaps a single entry for HDFS-1073 and then a section lower in the same CHANGES.txt file that itemizes it?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487071/hdfs-1073-merge.patch
          against trunk revision 1148348.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 134 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream
          org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/972//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/972//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/972//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/972//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          -1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).

          This is due to many new test cases that use SecondaryNameNode (all the new warnings are just this deprecation warning). I'd like to consider undeprecating it, since it's very well tested and I think we still intend to recommend its use in 0.23.

          -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

          This is the CHANGES.HDFS-1073.txt file. Please see above question – how should we integrate all of the subtask changelog items in the main CHANGES.txt?

          org.apache.hadoop.hdfs.server.namenode.TestEditLogFileOutputStream

          This passes on my box. I'll try to log in to the Hudson servers to see what's up.

          org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

          This is failing because it requires a new binary file to be committed. It passes on the branch where the file is committed – just not represented in the patch.

          Todd Lipcon added a comment -

          TestEditLogFileOutputStream was failing because it depended on the exact byte length of a mkdirs op. I'd written it with username 'todd', whereas the test ran with username 'hudson' - hence the mkdirs had a longer username and took 2 more bytes when running in the build. I just committed a small change (r1148591) to the test to only verify that the log length increased, rather than some specific length.
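          The shape of the fix, roughly (method and variable names are illustrative):

          import static org.junit.Assert.assertTrue;

          import java.io.File;

          class EditLogLengthAssertionSketch {
            /** Assert the edits file grew, instead of pinning an exact byte count. */
            static void assertLogGrew(File editsFile, long lengthBeforeOp) {
              assertTrue("edits file should have grown after logging the op",
                  editsFile.length() > lengthBeforeOp);
            }
          }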

          Todd Lipcon added a comment -

          All of the patches are committed now. Please review this patch which I believe is ready to merge.

          We will then do a second merge in a couple of weeks to pull in Ivan and Jitendra's nice cleanup of the journaling interfaces.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487094/hdfs-1073-merge.patch
          against trunk revision 1148348.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 134 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 34 javac compiler warnings (more than the trunk's current 23 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/973//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/973//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/973//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/973//console

          This message is automatically generated.

          Suresh Srinivas added a comment -

          > This will generate one RAT warning due to CHANGES.HDFS-1073.txt. What's the best way to integrate the changelist into CHANGES.txt? Should I dump the entire list in, or just a single entry for HDFS-1073? Or perhaps a single entry for HDFS-1073 and then a section lower in the same CHANGES.txt file that itemizes it?

          In federation I dumped the entire list into CHANGES.txt, with a "Federation:" tag in front of each change.

          Our CHANGES.txt protocol is woefully inadequate. Recording trivial jiras in CHANGES.txt dilutes its value for people looking for the important changes that are part of a release. Given that everything gets recorded in it, I decided to add all the entries.

          On a separate note, we should rethink our policy of adding every change to CHANGES.txt. At least we should consider adding tags: trivial, minor, major, critical, incremental for easier consumption.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #15 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/15/)
          HDFS-2172. Address findbugs and javadoc warnings in HDFS-1073 branch. Contributed by Todd Lipcon.
          HDFS-2170. Address remaining TODOs in HDFS-1073 branch. Contributed by Todd Lipcon.
          Merge trunk into HDFS-1073

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148592
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/findbugsExcludeFile.xml
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageArchivalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/util/AtomicFileOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/util/MD5FileUtils.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148589
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148533
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSShell.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestWriteConfigurationToDFS.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/unit/org/apache/hadoop/hdfs/server/namenode/TestNNLeaseRecovery.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/hdfsproxy
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestAbandonBlock.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSUtil.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/ClusterJspHelper.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestLeaseRecovery.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/HdfsConstants.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/DfsServlet.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/OfflineEditsViewerHelper.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/Host2NodesMap.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestUnderReplicatedBlocks.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/HftpFileSystem.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestTransferRbw.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/DFSClientAdapter.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/secondary
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/LeaseRenewer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FsckServlet.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenIdentifier.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/DFSTestUtil.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClientAdapter.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBlockUnderConstruction.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
          • /hadoop/common/branches/HDFS-1073/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestReplication.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/c++/libhdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FileDataServlet.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          Konstantin Shvachko added a comment -

          Todd, could you please post your benchmark results? I usually run NNThroughputBenchmark with a variety of thread counts from 100 to 1000. You can choose whatever number is optimal when you post the results.
          We are mostly interested in operations like create, rename, and delete, which update the edits. But we should also verify that open and blockReport perform the same as before, since they are not edits-related.
          I plan to review the changes with a focus on BN part.

          Todd Lipcon added a comment -

          Any preference whether the benchmark results are from a machine with SSD vs not? How much heap size do you typically configure for NNThroughputBenchmark?

          Todd Lipcon added a comment -

          I ran NNThroughputBenchmark with 100 threads, 100000 ops, three runs with trunk and three runs with 1073. The machine is Xeon E5540 at 2.53GHz, 8 cores w/ HT enabled. The edits disk is a single local SATA 7200rpm.

          Here are the mean ops/sec for the various mutations:

          op       trunk    1073
          create    5060    4993
          open     28120   28950
          delete    5552    5468
          rename    5455    5451

          Looking at the FSEditLog log, the mean numbers are:

          stat                  trunk     1073
          Total time for txns    6926     6822
          txns batched        1077882  1077876
          number of syncs       22217    22226
          SyncTimes            202537   204165

          To summarize, there is a small hit on the write operations, since they now log more data; this also shows up in the higher SyncTimes. The read ops are unaffected (open actually benchmarked faster in 1073, but it had fairly high variance).

          Todd Lipcon added a comment -

          Replying to Suresh above regarding CHANGES.txt:

          I agree that our current protocol isn't great. The problem with the way federation was done is that, although federation is a New Feature, many of the subcomponents were just bug fixes against the federation branch.

          Would it be OK with you if I did the following?

          • add in the NEW FEATURES section: "HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Please see HDFS-1073 section below for breakout of individual patches."
          • add a new section called "BREAKOUT OF HDFS-1073 Subtasks" with the contents of CHANGES.HDFS-1073.txt
          Konstantin Shvachko added a comment -

          SSD isn't practical now. Heap should fit the namespace.
          Thanks. This looks good to me.

          Todd Lipcon added a comment -

          > This looks good to me

          To be clear, do you mean the benchmark results, or the merge?

          Konstantin Shvachko added a comment -

          The benchmarks look good to me.

          Konstantin Shvachko added a comment -

          Did a quick sweep over warnings:

          1. Remove unused imports: CheckpointSignature, EditLogFileOutputStream, FSEditLogLoader, TransferFsImage, NamenodeProtocol, NamenodeRegistration, NamespaceInfo, EditsLoaderCurrent, ImageLoaderCurrent.
          2. Unused code: EditLogBackupInputStream.ByteBufferInputStream.getData()
          3. Move SecondaryNameNode.rollForwardByApplyingLogs() into Checkpointer to avoid deprecation warning and to ease removal of SNN in the future.

          Looking further.

          Todd Lipcon added a comment -

          Thanks for the comments. I did another sweep for unused imports and moved rollForwardByApplyingLogs like you suggested. I committed these changes to the branch.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #19 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/19/)
          Move SecondaryNameNode.rollForwardByApplyingEdits to Checkpointer. Remove unused code in EditLogBackupInputStream

          In response to Konstantin's review at:
          https://issues.apache.org/jira/browse/HDFS-1073?focusedCommentId=13070021&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13070021

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1150241
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
          Konstantin Shvachko added a comment -

          Looked at the checkpoint code. Comments:

          1. EditLogOutputStream does not extend OutputStream; what is the reason for that?
            If it does not, then there is nowhere to inherit JavaDoc from, which it currently tries to do.
          2. BackupImage.BNState - convert field descriptions to JavaDoc.
          3. BackupNodeProtocol would it be better to call it JournalProtocol?
          4. In FSEditLogOpCodes the comment below refers to a non-existent entity. Probably redundant.
            {{ // must be same as NamenodeProtocol.JA_JSPOOL_START}}
          5. JournalManager, BackupJournalManager, FileJournalManager should not be public. BackupJournalManager needs JavaDoc.
          6. FSImageOldStorageInspector: "Old" is not informative. Could be something like PreTransactional or Plain or something.
          7. Good design doc, but somewhat outdated. Do you plan to update it some time?
          8. FSEditLog.JournalAndStream should not have public methods if possible.
          9. I propose to wrap the part of doCheckpoint() that transfers the image and edits from the NN in a downloadCheckpoint() method, to make the former more readable (a sketch of the suggested ordering follows this list).
            Also, I recommend first downloading all files (image and edits), then applying them to memory. Now you do: download image, apply image, download edits, apply edits. It should be: download, download, apply, apply. That way it fails fast if a download is not successful.
          10. When a stream gets bad, we should force syncing the remaining journal streams, shouldn't we? Otherwise there is no way to distinguish between the failed streams and the valid ones. Or did I miss something?
          11. TransferFsImage should send/receive CheckpointSignature as a parameter to make sure that requests belong to the valid checkpoint. If it is hard to do it in this jira, let's open a new one (if not opened already).
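
          To illustrate point 9, here is a minimal sketch of the suggested download-then-apply ordering. This is illustrative only; all class and method names are hypothetical, not the branch's actual code.

            import java.io.File;
            import java.io.IOException;
            import java.util.List;

            // Complete every transfer before applying anything, so a failed
            // download aborts the checkpoint with nothing half-applied.
            abstract class CheckpointDownloadSketch {
              void doCheckpoint() throws IOException {
                File image = downloadImage();         // download...
                List<File> edits = downloadEdits();   // ...download...
                applyImage(image);                    // ...apply...
                for (File segment : edits) {
                  applyEdits(segment);                // ...apply
                }
              }
              abstract File downloadImage() throws IOException;
              abstract List<File> downloadEdits() throws IOException;
              abstract void applyImage(File image) throws IOException;
              abstract void applyEdits(File segment) throws IOException;
            }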

          Will look at journalling next.

          Matt Foley added a comment -

          I like the new model, and I like the way upgrade is handled.

          In NNStorageArchivalManager and JournalManager:

          I have difficulty with naming a method or interface "archiver" and then implementing it as "delete". That seems an incorrect abstraction. How about changing the name of the i/f StorageArchiver to "StorageDisposition" with methods "disposeLog" and "disposeImage"? Then it could have implementations DeletionStorageDisposition and ArchiverStorageDisposition, without prejudice. Similarly in JournalManager, the method archiveLogsOlderThan() could be renamed disposeLogsOlderThan().

          I'm not set on the specific word choice "disposition" or "dispose", but I think that if an allowable implementation of an interface is deletion, then it shouldn't be named "Archive".
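
          To make the suggestion concrete, a minimal sketch of that shape might look like the following (names here are illustrative, not code from the branch):

            import java.io.File;
            import java.io.IOException;

            // "Dispose" is neutral about whether old files are archived or deleted.
            interface StorageDisposition {
              void disposeLog(File editLog) throws IOException;
              void disposeImage(File imageFile) throws IOException;
            }

            // Deletion is then just one allowable disposition, without naming prejudice.
            class DeletionStorageDisposition implements StorageDisposition {
              public void disposeLog(File editLog) throws IOException {
                if (!editLog.delete()) {
                  throw new IOException("Could not delete " + editLog);
                }
              }
              public void disposeImage(File imageFile) throws IOException {
                if (!imageFile.delete()) {
                  throw new IOException("Could not delete " + imageFile);
                }
              }
            }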

          Ivan Kelly added a comment -

          > Similarly in JournalManager, the method archiveLogsOlderThan() could be renamed disposeLogsOlderThan().

          HDFS-2018 renames this to

          void purgeTransactions(long minTxIdToKeep) throws IOException;
          

          to match the design doc for HDFS-1580

          Todd Lipcon added a comment -

          Committed a number of small fixes to the branch to address the following from Konstantin's review:

          > EditLogOutputStream does not extend OutputStream, what is the reason for that?

          EditLogOutputStream is moving towards being more like a "sink for journal records" rather than a straight output stream. That is to say, the abstraction is a sequence of edits, rather than a sequence of bytes. So extending OutputStream wasn't really buying us anything.
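
          As a rough illustration of that abstraction (a sketch only; the method signatures here are hypothetical, not necessarily the branch's API):

            import java.io.IOException;

            // A sink for journal records rather than a byte stream: the unit of
            // writing is an edit (txid + opcode + payload), and durability is
            // two-phase: stage buffered records, then force them to disk.
            abstract class JournalRecordSink {
              abstract void write(long txid, byte opCode, byte[] payload) throws IOException;
              abstract void setReadyToFlush() throws IOException;
              abstract void flushAndSync() throws IOException;
              abstract void close() throws IOException;
            }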

          Good point about the javadoc inheritance. I added JavaDoc now that it is its own class. I also removed the write(int) API, which was only used internally.

          > BackupImage.BNState - convert field descriptions to JavaDoc.

          Fixed.

          > BackupNodeProtocol would it be better to call it JournalProtocol?

          Since it's only used for transferring edits to the BackupNode (and not any other type of journaling) I think the current name makes more sense. It's also more consistent with the other protocols like DatanodeProtocol and NamenodeProtocol.

          > In FSEditLogOpCodes the comment below refers to a non-existent entity. Probably redundant...

          Fixed. I also noticed that OP_JSPOOL_START was still referenced in the code in a few places, but no longer needed. I removed it, along with the isOperationSupported() call, which was also obsolete.

          > JournalManager, BackupJournalManager, FileJournalManager should not be public. BackupJournalManager needs JavaDoc.

          Fixed.

          > FSImageOldStorageInspector: "Old" is not informative. Could be something like PreTransactional or Plain or something.

          Good idea. I renamed it to PreTransactional.

          > Good design doc, but somewhat outdated. Do you plan to update it some time?

          I see the main purpose of the design doc as guiding development, rather than documenting the architecture after it's done. I will try to update any places where it's grossly inaccurate after we've merged this to trunk, though.

          > FSEditLog.JournalAndStream should not have public methods if possible.

          Fixed.

          > I propose to wrap the part of doCheckpoint() that transfers image and edits from NN in downloadCheckpoint() method to make the former more readable.
          > Also I recommend first downloading all files (image and edits), then applying them to memory. Now you do: download image, apply image, download edits, apply edits. Should be: download, download, apply, apply. That way it will fail fast if download is not successful.

          I looked into doing this, but it wasn't straightforward, since we don't always need to download the image. Would it be alright to address this after the merge? I can file a JIRA so it doesn't get forgotten.

          > When a stream gets bad, we should force syncing the remaining journal streams, shouldn't we? Otherwise there is no way to distinguish between failed streams and the valid ones. Or did I miss something?

          I'm not sure I follow. We only detect that a stream is bad when we're syncing, so we're syncing the other ones at the same time anyway.

          We know that stream is bad because JournalAndStream.isActive returns false after the stream is aborted. On disk, we can distinguish it from the non-failed streams since the non-failed streams will be longer during log recovery. See TestFSImageStorageInspector.testLogGroupRecoveryInProgress as well as some of the crash-recovery related test cases in TestFSEditLog.

          > TransferFsImage should send/receive CheckpointSignature as a parameter to make sure that requests belong to the valid checkpoint

          We do validate that the request belongs to the correct namespace by passing the namespaceID/clusterID/etc. See GetImageServlet.java:92 and TestCheckpoint.testReformatNNBetweenCheckpoints.

          Todd Lipcon added a comment -

          Matt: How about changing it to StoragePurger, etc, as Ivan suggests?

          Matt Foley added a comment -

          Yes, that would be okay. Thanks.

          Todd Lipcon added a comment -

          OK. I committed the rename to the branch in r1151192.

          Konstantin Shvachko added a comment -

          Checked the journalling code. Generally it looks good. Some comments:

          > Since it's only used for transferring edits ...

          Exactly the point. The way you extracted it, it is only used for journaling. It is different from DatanodeProtocol as it does not have e.g. register(), so it is not a complete BackupNode protocol. Also, the same journal protocol could be used by a StandbyNode, and it would then be confusing to have BackupNode in the name.
          So I would either rename it to JournalProtocol or not factor this protocol out at all: keep the methods inside NamenodeProtocol. The latter makes sense to me if, as you proposed, different NameNodes will be the same entity with different roles or states.

          > I see the main purpose of the design doc to guide development, rather than to document the architecture after it's done.

          Documenting the architecture is important, as it is the proof of correctness of the approach. Also, I bet in a couple of months you will not remember all the details and will need that design doc to refresh your mind. I will.

          > On disk, we can distinguish it from the non-failed streams since the non-failed streams will be longer

          I hope "longer" does not mean file length? Besides that, your explanation seems reasonable.
          Do you attempt to restore bad streams on rollEdits() as done by attemptRestoreRemovedStorage() in current implementation?

          > I removed it as well as the isOperationSupported()

          Good point. Just noticed the same thing.
          You should also be able to eliminate OP_JSPOOL_START, as you no longer have the FSEditLog.logJSpoolStart() method.

          Having said that, how do you determine when to roll edits on the BackupNode without logJSpoolStart()?
          Explaining: in the current implementation, OP_JSPOOL_START is sent as part of the journal stream, so the BN knows exactly after which transaction the edits should be rolled. In your implementation logJSpoolStart() is replaced by startLogSegment(segmentTxId). Is it possible in your implementation that
          a) the BN has already processed transactions with a higher id than segmentTxId, or
          b) the BN has not yet seen the transaction preceding segmentTxId?
          According to the Precondition this should not be possible. What guarantees that?

          Todd Lipcon added a comment -

          I see your point now about the protocol naming. I'll change it to JournalProtocol.

          > Documenting the architecture is important as it is the proof of correctness of the approach

          If writing documents about code guaranteed the code were correct, our jobs would be a lot easier, wouldn't they? But yes, I'll clean up the doc after merging to make sure there's nothing inaccurate.

          I hope "longer" does not mean file length?

          If the only logs available starting at a given transaction ID are named edits_inprogress_N, we read through them to determine the "valid length", i.e. the number of valid transactions. A transaction is valid if it has a valid checksum, a sequential transaction ID, etc. The one with the most valid transactions is chosen.

          So, extra 0s or FFs on the end of a file won't affect the "valid length".
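
          A minimal sketch of that validation loop follows; the reader interface here is a stand-in, not the branch's real API.

            import java.io.IOException;

            class EditLogValidationSketch {
              // Stand-in reader: returns the txid of the next op after verifying
              // its checksum, or -1 at a clean end of stream; throws on a corrupt
              // or preallocated (0x00/0xFF) tail.
              interface OpReader {
                long readNextTxId() throws IOException;
              }

              // Count ops with valid checksums and sequential txids; the
              // in-progress file with the most valid transactions is chosen.
              static long countValidTransactions(OpReader in, long firstTxId) {
                long expected = firstTxId;
                long valid = 0;
                while (true) {
                  long txid;
                  try {
                    txid = in.readNextTxId();
                  } catch (IOException corruptTail) {
                    break;  // trailing garbage does not count
                  }
                  if (txid != expected) {
                    break;  // clean end of stream (-1) or a txid gap
                  }
                  expected++;
                  valid++;
                }
                return valid;
              }
            }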

          > Do you attempt to restore bad streams on rollEdits() as done by attemptRestoreRemovedStorage() in the current implementation?

          Yes: each JournalManager creates a new OutputStream object when edits are rolled.

          > ...OP_JSPOOL_START...

          Yep, this is entirely eliminated now.

          > Is it possible in your implementation that
          > a) BN already processed transactions with higher id than segmentTxId
          > b) BN hasn't seen yet transaction preceding segmentTxId
          > According to Precondition this should not be possible. What guarantees that?

          Because all of the calls to the BN go through JournalManager, and all of the calls are synchronous, the ordering won't get interleaved. That is to say, when an edit log is rolled, the startLogSegment() RPC call must respond before the next transaction can be journaled. And, before calling startLogSegment(), the previous log segment is flushed, guaranteeing that all previous edits "made it".

          The Precondition is there just in case there's a bug that we missed; this way we'll get a BN crash rather than something worse, like silent data loss.
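
          For example, the guard might look like the following (field and method names are hypothetical):

            import com.google.common.base.Preconditions;

            // The BN verifies that a new segment starts exactly one past the last
            // txid it applied, crashing loudly instead of risking silent divergence.
            class BackupJournalSketch {
              private long lastAppliedTxId;

              void startLogSegment(long segmentTxId) {
                Preconditions.checkState(segmentTxId == lastAppliedTxId + 1,
                    "Unexpected segment start txid %s; last applied txid was %s",
                    segmentTxId, lastAppliedTxId);
                // ... open a new local edits_inprogress_<segmentTxId> here ...
              }
            }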

          Matt Foley added a comment -

          Perhaps the rename of "Archival" to "Purge" should include the class names of
          NNStorageArchivalManager, TestNNStorageArchivalManager, and TestNNStorageArchivalFunctional.

          Todd Lipcon added a comment -

          My thinking was that the "archival policy" determines how things are kept vs archived vs deleted... maybe RetentionManager is better?

          Matt Foley added a comment -

          I would phrase it that the "purge policy" determines how long to retain the files online, and what to do with them after you no longer want to keep them online - archive, delete, or whatever.
          So calling it Retention Policy and RetentionManager would be fine.

          Todd Lipcon added a comment -

          Great. I'll rename to NNStorageRetentionManager or something of that sort - probably tomorrow morning, along with the protocol renaming as suggested by Konstantin.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #22 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/22/)
          Rename StorageArchiver to StoragePurger as suggested by Matt and Ivan in the comments on HDFS-1073

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1151192
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageArchivalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageArchivalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageArchivalFunctional.java
          Todd Lipcon added a comment -

          I've renamed BackupNodeProtocol to JournalProtocol, and renamed NNStorageArchivalManager to NNStorageRetentionManager in the branch.

          Any further comments?

          Jitendra Nath Pandey added a comment -

          +1. I think the patch is in good shape and ready for merge.

          Matt Foley added a comment -

          +1. I have become confident that the merge should proceed.

          Eli Collins added a comment -

          +1 Let's merge.

          Ivan Kelly added a comment -

          +1
          I've got a whole load of patches waiting to go on top of this, so the sooner it goes in the better.

          Todd Lipcon added a comment -

          Great, we now have +1s from the following committers: Jitendra, Eli, and Matt, plus an additional +1 from Ivan who has reviewed much of the code and is knowledgeable. So, this should be good to merge. If there is further review feedback I'll continue to address it in follow-up JIRAs.

          There is a bit of a conflict on the merge currently because of a couple of patches that went into trunk. I will fix this and post a final merge patch this evening.

          Todd Lipcon added a comment -

          Up-to-date merge

          Todd Lipcon added a comment -

          test-patch:

               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 148 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     -1 javac.  The applied patch generated 32 javac compiler warnings (more than the trunk's current 23 warnings).
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.8) warnings.
               [exec] 
               [exec]     -1 release audit.  The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
               [exec] 
               [exec]     +1 system test framework.  The patch passed system test framework compile.
          

          > -1 javac. The applied patch generated 32 javac compiler warnings (more than the trunk's current 23 warnings).

          This is just due to more references to SecondaryNameNode, which is officially deprecated. We now have much better test coverage, hence more deprecation warnings.

          > -1 release audit.

          This is for CHANGES.HDFS-1073.txt, which will be merged into CHANGES.txt when the svn trees are actually merged.

          Running unit tests now.

          Matt Foley added a comment -

          I still believe that HDFS-2136 is very important. Please keep it on the post-merge cleanup list. Thanks.

          Todd Lipcon added a comment -

          Unit tests passed except for a couple of failures that also occur on trunk (e.g. TestHDFSCLI, and a few timeouts due to HDFS-2213 that did not reproduce when I reran the tests in question). I will commit this to trunk momentarily.

          Konstantin Shvachko added a comment -

          I reviewed this some more and didn't find any outstanding issues. I am also +1. Good job, Todd!

          • Let's just fix those test failures.
          • For Java warnings, could you please add @SuppressWarnings for the deprecations you mentioned (see the sketch below)? We generally should target zero warnings in the code.
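
          A minimal sketch of the kind of suppression being requested, assuming the warnings come from compiling references to deprecated classes such as SecondaryNameNode. The class and method names below are made up for illustration and are not HDFS code:

               // Illustrative only: @SuppressWarnings("deprecation") silences the
               // javac deprecation warning at the point of use, which is how a
               // build can stay at zero warnings without un-deprecating the class.
               class LegacyCheckpointer {
                   @Deprecated
                   void doCheckpoint() { /* legacy checkpoint path */ }
               }

               class CheckpointTest {
                   // Without this annotation, javac flags the call below.
                   @SuppressWarnings("deprecation")
                   void runLegacyCheckpoint() {
                       new LegacyCheckpointer().doCheckpoint();
                   }
               }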

          One general comment.
          The patch started as a fairly straightforward change to the structure of journal files, but ended up touching many different parts and essentially rewriting some major components of HDFS. Some people mentioned that, in a way, this is an abuse of the idea of using dev branches for large changes. In the extreme, it would look like somebody making a change and piggybacking everything they ever wanted to do with the system onto it. This is not to criticize the work done, but something to keep in mind for the future: there should be a fine balance between what is done on trunk and what goes into the branch. E.g., refactoring on trunk, main logic on the branch.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #738 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/738/)
          HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Contributed by Todd Lipcon and Ivan Kelly.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152295
          Files :

          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend4.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartupOptionUpgrade.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsLoaderCurrent.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/CreateEditsLog.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogRace.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
          • /hadoop/common/trunk/hdfs/ivy/libraries.properties
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored.xml
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/util/TestMD5FileUtils.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartup.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgrade.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSStorageStateRecovery.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImagePreTransactionalStorageInspector.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/common/StorageAdapter.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestFSImageStorageInspector.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsElement.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorageRetentionManager.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageStorageInspector.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestTransferFsImage.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/common/StorageInfo.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java
          • /hadoop/common/trunk/hdfs/ivy.xml
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/test/GenericTestUtils.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
          • /hadoop/common/trunk/hdfs/src/test/findbugsExcludeFile.xml
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestCheckPointForSecurityTokens.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/util/MD5FileUtils.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/common/Storage.java
          • /hadoop/common/trunk/hdfs/CHANGES.txt
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/util/TestAtomicFileOutputStream.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/UpgradeUtilities.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/CheckpointCommand.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSecurityTokenEditLog.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamenodeRegistration.java
          • /hadoop/common/trunk/hdfs/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/NamenodeProtocol.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/FSConstants.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/OfflineEditsViewerHelper.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/RemoteEditLog.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNameEditsConfigs.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageVisitor.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionFunctional.java
          • /hadoop/common/trunk/hdfs/src/java/hdfs-default.xml
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/OfflineEditsViewer.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/CheckpointSignature.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSRollback.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/protocol/RemoteEditLogManifest.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/Tokenizer.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/util/AtomicFileOutputStream.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSFinalize.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
          • /hadoop/common/trunk/hdfs/bin/hdfs
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogOutputStream.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditsDoubleBuffer.java
          • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #812 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/812/)
          HDFS-1073. Redesign the NameNode's storage layout for image checkpoints and edit logs to introduce transaction IDs and be more robust. Contributed by Todd Lipcon and Ivan Kelly.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152295
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #23 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/23/)
          Merge trunk into HDFS-1073.

          Resolved several conflicts due to merge of HDFS-2149 and HDFS-2212.
          Changes during resolution were:

          • move the writing of the transaction ID out of EditLogOutputStream to
            FSEditLogOp.Writer to match trunk's organization (see the sketch after
            this list)
          • remove JSPOOL-related FSEditLogOp subclasses, add LogSegmentOp subclasses
          • modify TestEditLogJournalFailures to not keep trying to use streams after
            the simulated halt, since newer stricter assertions caused these writes to
            fail
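
          As a minimal sketch of the first bullet, assuming a serializer that stamps each op with its transaction ID before the op-specific payload (the OpWriter class and its method shapes are illustrative, not the actual FSEditLogOp.Writer API):

               import java.io.DataOutputStream;
               import java.io.IOException;

               // Illustrative serializer: the transaction ID is written next to the
               // op by the writer itself, so every EditLogOutputStream implementation
               // sees identically framed records instead of each stream writing txids.
               class OpWriter {
                   private final DataOutputStream out;

                   OpWriter(DataOutputStream out) {
                       this.out = out;
                   }

                   void writeOp(byte opCode, long txid, byte[] payload) throws IOException {
                       out.writeByte(opCode); // which operation this record encodes
                       out.writeLong(txid);   // unique transaction ID for this edit
                       out.write(payload);    // op-specific fields, already serialized
                   }
               }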

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152128
          Files :

          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend2.java
          • /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.txt
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestHeartbeatHandling.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileStatus.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/docs/src/documentation/content/xdocs/hdfsproxy.xml
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestComputeInvalidateWork.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/build.xml
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDatanodeDeath.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCreationDelete.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend3.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/JspHelper.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileConcurrentReader.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/docs/src/documentation/content/xdocs/site.xml
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/unit/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithMultipleNameNodes.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDecommission.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditsDoubleBuffer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestLeaseRecovery2.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/common/Storage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/datanode
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestMultiThreadedHflush.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileAppend4.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/UpgradeUtilities.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/contrib/hdfsproxy
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCreation.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/aop/org/apache/hadoop/hdfs/TestFiPipelines.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCorruption.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/blockmanagement/TestComputeInvalidateWork.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestPipelines.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestReadWhileWriting.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/webapps/secondary
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNodeAdapter.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestRenameWhileOpen.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCreationEmpty.java
          • /hadoop/common/branches/HDFS-1073/hdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestBlockReport.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/c++/libhdfs
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java
          • /hadoop/common/branches/HDFS-1073/hdfs/build.xml
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestFileCreationClient.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReplicationBlocks.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogOutputStream.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditsDoubleBuffer.java
          • /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          Hide
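For readers skimming the merge notes above: the change that moves transaction-ID writing out of EditLogOutputStream and into FSEditLogOp.Writer can be pictured with a minimal sketch. The classes below are simplified illustrations for this thread, not the actual HDFS source; only the idea — the op writer owns the record framing, including the txid — is taken from the merge description.

import java.io.DataOutputStream;
import java.io.IOException;

// Simplified stand-in for an edit log operation. In this arrangement the
// op carries its own transaction id, assigned when the edit is logged.
class SketchEditLogOp {
    final byte opCode;  // which operation this edit represents
    final long txid;    // unique, monotonically increasing transaction id

    SketchEditLogOp(byte opCode, long txid) {
        this.opCode = opCode;
        this.txid = txid;
    }

    void writeFields(DataOutputStream out) throws IOException {
        // op-specific payload (paths, permissions, etc.) would go here
    }

    // The Writer, not the output stream, frames each record:
    // opcode first, then the txid, then the op body.
    static class Writer {
        private final DataOutputStream out;

        Writer(DataOutputStream out) {
            this.out = out;
        }

        void writeOp(SketchEditLogOp op) throws IOException {
            out.writeByte(op.opCode); // 1 byte: operation type
            out.writeLong(op.txid);   // 8 bytes: transaction id
            op.writeFields(out);      // operation-specific fields
        }
    }
}

With the framing concentrated in the writer, the output stream is reduced to buffering and syncing bytes, which is what allowed the branch to match trunk's organization after the HDFS-2149/HDFS-2212 merge.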
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #1001 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1001/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #1079 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1079/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #1021 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1021/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Build #36 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/36/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178611
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #820 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/820/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #29 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/29/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178611
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #850 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/850/)
          Fix CHANGES.txt to include complete subtask list for HDFS-1073.

          Somehow in the merge, some subtasks were lost from CHANGES.txt.
          I spot-checked these patches to make sure they were in fact merged,
          and it was only CHANGES.txt that was missing them.

          todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1178610
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

            People

            • Assignee: Todd Lipcon
            • Reporter: Sanjay Radia
            • Votes: 0
            • Watchers: 47
