Hadoop HDFS / HDFS-5284 Flatten INode hierarchy / HDFS-5714

Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile

    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      Currently we define specific classes to represent different INode features, such as FileUnderConstructionFeature and FileWithSnapshotFeature. Even when feature information is recorded this way, the internal fields and object references can still cost a lot of memory. For example, for FileWithSnapshotFeature, not counting the INode's local name, a FileDiff list of size n can cost around 120n bytes.

      To reduce memory usage, we plan to use a byte array to record the UnderConstruction feature and the Snapshot feature for INodeFile. Specifically, if we use protobuf's encoding, the memory usage of a FileWithSnapshotFeature holding n FileDiffs can be less than 56n bytes.
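      As a rough illustration of why a byte array is more compact than a list of objects, the sketch below packs a list of diffs into a single array using base-128 varints, the same integer encoding protobuf uses on the wire. It does not use the protobuf library and is not the HDFS implementation: the class CompactFileDiffs, the Diff type, and its two fields (snapshotId, fileSize) are hypothetical simplifications chosen for the example, and values are assumed to be non-negative.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: packs simplified file diffs into one byte array. */
final class CompactFileDiffs {

    /** Simplified stand-in for a file diff (assumed fields, not the HDFS class). */
    static final class Diff {
        final int snapshotId;
        final long fileSize;

        Diff(int snapshotId, long fileSize) {
            this.snapshotId = snapshotId;
            this.fileSize = fileSize;
        }
    }

    /** Writes a non-negative value as a base-128 varint (7 payload bits per byte). */
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write((int) value);
    }

    /** Reads one varint from buf starting at pos[0] and advances pos[0]. */
    static long readVarint(byte[] buf, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    /** Layout: element count, then (snapshotId, fileSize) per diff, all varints. */
    static byte[] encode(List<Diff> diffs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, diffs.size());
        for (Diff d : diffs) {
            writeVarint(out, d.snapshotId);
            writeVarint(out, d.fileSize);
        }
        return out.toByteArray();
    }

    static List<Diff> decode(byte[] bytes) {
        int[] pos = {0};
        int count = (int) readVarint(bytes, pos);
        List<Diff> diffs = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int snapshotId = (int) readVarint(bytes, pos);
            long fileSize = readVarint(bytes, pos);
            diffs.add(new Diff(snapshotId, fileSize));
        }
        return diffs;
    }

    public static void main(String[] args) {
        List<Diff> diffs = new ArrayList<>();
        diffs.add(new Diff(1, 300L));
        diffs.add(new Diff(2, 5L));
        byte[] packed = encode(diffs);
        System.out.println("encoded bytes: " + packed.length);
        System.out.println("decoded diffs: " + decode(packed).size());
    }
}
```

      Small values take one byte each in this encoding, and the array has no per-element object header or reference, which is where most of the saving over a list of objects would come from.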

        Activity

        Jing Zhao made changes -
        Attachment HDFS-5714.000.patch [ 12622010 ]
        Jing Zhao added a comment -

        Early patch for review. In general, the patch:
        1. Encodes the whole FileDiffList into a byte array. Instead of always keeping the byte array in memory, the patch currently encodes a FileDiffList to a byte array only when it is first loaded from the FSImage. If the corresponding snapshot information is accessed later, the byte array is decoded into a FileDiffList and is not encoded again until the next NameNode restart.
        2. Removes ClientNode from FileUnderConstructionFeature and uses a byte array to represent the ClientName and ClientMachine.
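        The decode-on-first-access behaviour described in item 1 can be sketched roughly as follows. This is a minimal illustration, not code from the patch: the class LazyDecoded, its methods, and the use of a UTF-8 string as the decoded form are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/**
 * Hypothetical sketch: holds a value in encoded form until it is first read,
 * then keeps only the decoded object and never re-encodes it.
 */
final class LazyDecoded<T> {
    private byte[] encoded;                     // compact form, dropped after first access
    private T decoded;                          // object form, built on demand
    private final Function<byte[], T> decoder;  // must not return null

    LazyDecoded(byte[] encoded, Function<byte[], T> decoder) {
        this.encoded = encoded;
        this.decoder = decoder;
    }

    /** Decodes on the first call; later calls return the same object. */
    synchronized T get() {
        if (decoded == null) {
            decoded = decoder.apply(encoded);
            encoded = null; // release the byte array; it is not rebuilt
        }
        return decoded;
    }

    /** True while the value has never been accessed. */
    synchronized boolean isStillEncoded() {
        return encoded != null;
    }

    public static void main(String[] args) {
        byte[] raw = "snapshot-diffs".getBytes(StandardCharsets.UTF_8);
        LazyDecoded<String> lazy =
            new LazyDecoded<>(raw, b -> new String(b, StandardCharsets.UTF_8));
        System.out.println(lazy.isStillEncoded());
        System.out.println(lazy.get());
        System.out.println(lazy.isStillEncoded());
    }
}
```

        Under this scheme, files whose snapshot information is never touched after startup stay in the compact form, while accessed files pay the full object cost from then on.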

        Jing Zhao made changes -
        Field: Component/s | Original Value: (empty) | New Value: namenode [ 12312926 ]
        Jing Zhao created issue -

          People

          • Assignee:
            Jing Zhao
            Reporter:
            Jing Zhao
          • Votes:
            0
            Watchers:
            3
