  1. Hadoop HDFS
  2. HDFS-5698

Use protobuf to serialize / deserialize FSImage


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Use protobuf to serialize/deserialize the FSImage.

    Description

      Currently, the code serializes the FSImage using in-house serialization mechanisms. The current approach has several disadvantages:

      1. Mixed responsibilities of reconstruction and serialization / deserialization. Considerable effort has gone into maintaining compatibility in the current serialization / deserialization code paths. Worse, these paths are interleaved with the complex logic of reconstructing the namespace, making the code difficult to follow.
      2. Poor documentation of the current FSImage format. The format is effectively defined by the implementation, so a bug in the implementation is a bug in the specification. This also makes writing third-party tools quite difficult.
      3. Changing schemas is non-trivial. Adding a field to the FSImage requires bumping the layout version every time. Bumping the layout version requires (1) users to explicitly upgrade their clusters, and (2) new code to maintain backward compatibility.
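
      The third point can be illustrated with a minimal Python sketch. It assumes a hypothetical positional record (a layout version followed by fields in a fixed order) and is not the real FSImage format; the field names `inode_id`, `replication`, and `quota` are invented for illustration.

```python
import struct

# Hypothetical positional layout, NOT the real FSImage format.
LAYOUT_V1 = 1    # fields: inode_id, replication
LAYOUT_V2 = 2    # fields: inode_id, replication, quota (added later)
HEADER = ">iqh"  # layout version, inode_id, replication (big-endian, unpadded)


def write_inode(version, inode_id, replication, quota=0):
    """Serialize one record; the field set depends on the layout version."""
    data = struct.pack(HEADER, version, inode_id, replication)
    if version == LAYOUT_V2:
        data += struct.pack(">q", quota)
    return data


def read_inode_old(data):
    """Reader released before `quota` existed: it only knows LAYOUT_V1."""
    version, inode_id, replication = struct.unpack_from(HEADER, data, 0)
    if version != LAYOUT_V1:
        raise ValueError(f"unsupported layout version {version}")
    return {"inode_id": inode_id, "replication": replication}


def read_inode_new(data):
    """Newer reader: needs one branch per layout version it must support."""
    version, inode_id, replication = struct.unpack_from(HEADER, data, 0)
    record = {"inode_id": inode_id, "replication": replication}
    if version == LAYOUT_V2:
        (record["quota"],) = struct.unpack_from(">q", data, struct.calcsize(HEADER))
    elif version != LAYOUT_V1:
        raise ValueError(f"unsupported layout version {version}")
    return record
```

      In this sketch, adding `quota` forces a new layout version, the old reader rejects the new records outright, and the new reader carries a compatibility branch for every version it still accepts.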

      This jira proposes using protobuf to serialize the FSImage. Protobuf is already used to serialize / deserialize RPC messages in Hadoop.

      Protobuf addresses all of the above problems. It clearly separates the responsibility of serialization from that of reconstructing the namespace. The protobuf files document the current format of the FSImage. Developers can now add optional fields with ease, since old code can still read the new FSImage, skipping fields it does not recognize.
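
      Why tagged fields make such additions cheap can be sketched in Python. This is a deliberately simplified tag-length-value encoding, not protobuf's actual wire format (which uses varint-encoded keys combining a field number and a wire type); the tag numbers and field names are hypothetical.

```python
import struct

FIELD_HEADER = ">BI"  # 1-byte tag, 4-byte payload length (simplified, not protobuf)
FIELD_HEADER_SIZE = struct.calcsize(FIELD_HEADER)


def encode(fields):
    """Encode a dict of {tag: payload bytes} as a sequence of tagged fields."""
    out = bytearray()
    for tag, payload in fields.items():
        out += struct.pack(FIELD_HEADER, tag, len(payload))
        out += payload
    return bytes(out)


def decode(data, known_tags):
    """Decode tagged fields, keeping those in known_tags and skipping the rest.

    known_tags maps tag numbers to field names for the reader's schema version.
    """
    record = {}
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from(FIELD_HEADER, data, offset)
        offset += FIELD_HEADER_SIZE
        payload = data[offset:offset + length]
        offset += length  # the length prefix lets us step over unknown fields
        if tag in known_tags:
            record[known_tags[tag]] = payload
    return record
```

      Under these assumptions, a reader whose schema only lists tags 1 and 2 can still decode data that also carries a tag 3: the length prefix tells it how many bytes to skip, so no layout version bump is needed for the new field.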

      Attachments

        1. HDFS-5698-branch2.000.patch
          182 kB
          Haohui Mai
        2. HDFS-5698.007.patch
          179 kB
          Haohui Mai
        3. HDFS-5698.006.patch
          179 kB
          Haohui Mai
        4. HDFS-5698.005.patch
          178 kB
          Haohui Mai
        5. HDFS-5698.004.patch
          196 kB
          Haohui Mai
        6. HDFS-5698.003.patch
          178 kB
          Haohui Mai
        7. HDFS-5698-design.pdf
          118 kB
          Haohui Mai
        8. HDFS-5698.002.patch
          179 kB
          Haohui Mai
        9. HDFS-5698.001.patch
          175 kB
          Haohui Mai
        10. HDFS-5698.000.patch
          107 kB
          Haohui Mai

          People

            Assignee: Haohui Mai (wheat9)
            Reporter: Haohui Mai (wheat9)
            Votes: 0
            Watchers: 24