Hadoop HDFS / HDFS-5698

Use protobuf to serialize / deserialize FSImage

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0
    • Component/s: namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Use protobuf to serialize/deserialize the FSImage.

      Description

      Currently, the code serializes the FSImage using in-house serialization mechanisms. This approach has several disadvantages:

      1. Reconstruction is mixed with serialization / deserialization. A lot of effort has gone into keeping the serialization / deserialization code paths compatible across versions. Worse, those paths are interleaved with the complex logic that reconstructs the namespace, which makes the code difficult to follow.
      2. The current FSImage format is poorly documented. In practice, the format is defined by the implementation, so a bug in the implementation is a bug in the specification. This also makes writing third-party tools quite difficult.
      3. Changing the schema is non-trivial. Adding a field to the FSImage requires bumping the layout version every time. Bumping the layout version requires (1) users to explicitly upgrade their clusters, and (2) new code to maintain backward compatibility.

      This JIRA proposes using protobuf to serialize the FSImage. Hadoop already uses protobuf to serialize / deserialize its RPC messages.

      Protobuf addresses all of the above problems. It clearly separates the responsibility of serialization from that of reconstructing the namespace. The protobuf files document the current format of the FSImage. Developers can now add optional fields with ease, since old code skips the fields it does not recognize and can therefore still read the new FSImage.
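
The compatibility argument above can be illustrated with a minimal sketch. It hand-rolls a small subset of the protobuf wire format (varint and length-delimited fields only) instead of using the protobuf library, so that it runs without dependencies. The class name, the record layout, and the field numbers (1 = id, 2 = name, 3 = a newly added optional field) are assumptions made for illustration; they are not taken from the FSImage schema in the patch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: a tiny subset of the protobuf wire format, not the HDFS code. */
class OptionalFieldSketch {

    /** What a hypothetical "old" reader knows about: fields 1 (id) and 2 (name). */
    static final class InodeV1 {
        long id;
        String name = "";
    }

    /** Varint: 7 payload bits per byte, least significant group first, high bit = "more". */
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    /** A field key is (fieldNumber << 3) | wireType; 0 = varint, 2 = length-delimited. */
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, ((long) fieldNumber << 3) | wireType);
    }

    /** Hypothetical "new" writer: fields 1 and 2 plus a newly added optional field 3. */
    static byte[] encodeV2(long id, String name, long newField) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeTag(out, 1, 0);
        writeVarint(out, id);
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        writeTag(out, 2, 2);
        writeVarint(out, nameBytes.length);
        out.write(nameBytes, 0, nameBytes.length);
        writeTag(out, 3, 0);
        writeVarint(out, newField);
        return out.toByteArray();
    }

    static long readVarint(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    /** "Old" reader: decodes fields 1 and 2 and skips any field number it does not know. */
    static InodeV1 decodeV1(byte[] data) {
        InodeV1 inode = new InodeV1();
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            long tag = readVarint(buf);
            int field = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (field == 1 && wireType == 0) {
                inode.id = readVarint(buf);
            } else if (field == 2 && wireType == 2) {
                byte[] raw = new byte[(int) readVarint(buf)];
                buf.get(raw);
                inode.name = new String(raw, StandardCharsets.UTF_8);
            } else if (wireType == 0) {
                readVarint(buf); // unknown varint field: consume and ignore
            } else if (wireType == 2) {
                int len = (int) readVarint(buf);
                buf.position(buf.position() + len); // unknown bytes field: jump over it
            } else {
                throw new IllegalStateException("wire type not handled in this sketch: " + wireType);
            }
        }
        return inode;
    }

    public static void main(String[] args) {
        byte[] newImage = encodeV2(16385L, "hadoop", 7L);
        InodeV1 inode = decodeV1(newImage);
        System.out.println("id=" + inode.id + " name=" + inode.name);
    }
}
```

Because every field carries its own number and wire type, the reader never needs a layout version to know how many bytes to skip. This is the property that lets a new optional field be added without bumping the layout version.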

      1. HDFS-5698-branch2.000.patch
        182 kB
        Haohui Mai
      2. HDFS-5698.007.patch
        179 kB
        Haohui Mai
      3. HDFS-5698.006.patch
        179 kB
        Haohui Mai
      4. HDFS-5698.005.patch
        178 kB
        Haohui Mai
      5. HDFS-5698.004.patch
        196 kB
        Haohui Mai
      6. HDFS-5698.003.patch
        178 kB
        Haohui Mai
      7. HDFS-5698-design.pdf
        118 kB
        Haohui Mai
      8. HDFS-5698.002.patch
        179 kB
        Haohui Mai
      9. HDFS-5698.001.patch
        175 kB
        Haohui Mai
      10. HDFS-5698.000.patch
        107 kB
        Haohui Mai

        Issue Links

        1. Save FSImage header in protobuf (Sub-task, Resolved, Haohui Mai)
        2. Implement compression in the HTTP server of SNN / SBN instead of FSImage (Sub-task, Resolved, Unassigned)
        3. Remove compression support from FSImage (Sub-task, Resolved, Haohui Mai)
        4. Serialize INode information in protobuf (Sub-task, Resolved, Haohui Mai)
        5. Use protobuf to serialize snapshot information (Sub-task, Resolved, Jing Zhao)
        6. Serialize information for token managers in protobuf (Sub-task, Resolved, Haohui Mai)
        7. Track progress when loading fsimage (Sub-task, Resolved, Haohui Mai)
        8. Serialize under-construction file information in FSImage (Sub-task, Resolved, Jing Zhao)
        9. Serialize CachePool directives in protobuf (Sub-task, Resolved, Haohui Mai)
        10. Compute the digest before loading FSImage (Sub-task, Resolved, Haohui Mai)
        11. Serialize symlink in protobuf (Sub-task, Resolved, Haohui Mai)
        12. Optimize the serialization of PermissionStatus (Sub-task, Resolved, Haohui Mai)
        13. Implement offline image viewer. (Sub-task, Resolved, Haohui Mai)
        14. Implement cancellation when saving FSImage (Sub-task, Resolved, Haohui Mai)
        15. Add a Type field in Snapshot DiffEntry's protobuf definition (Sub-task, Resolved, Jing Zhao)
        16. Update the stored edit logs to be consistent with the changes in HDFS-5698 branch (Sub-task, Resolved, Haohui Mai)
        17. Consolidate INodeReference into a separate section (Sub-task, Closed, Jing Zhao)
        18. Use PBHelper to serialize CacheDirectiveInfoExpirationProto (Sub-task, Resolved, Haohui Mai)
        19. LoadDelegator should use IOUtils.readFully() to read the magic header (Sub-task, Resolved, Haohui Mai)
        20. Add annotation for repeated fields in the protobuf definition (Sub-task, Resolved, Haohui Mai)
        21. Add suffix to generated protobuf class (Sub-task, Patch Available, Tassapol Athiapinya)
        22. Fixing findbugs and javadoc warnings in the HDFS-5698 branch (Sub-task, Resolved, Haohui Mai)
        23. The id of a CacheDirective instance does not get serialized in the protobuf-fsimage (Sub-task, Resolved, Haohui Mai)


            People

            • Assignee: Haohui Mai
            • Reporter: Haohui Mai
            • Votes: 0
            • Watchers: 23
