Hadoop HDFS / HDFS-11382

Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 3.0.0-alpha4
    • Component/s: hdfs
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Release Note:
      The FSImage on-disk format for INodeFile is changed to include an additional field for Erasure Coded files. This optional field, 'erasureCodingPolicyID', is of type uint32; it is set for all Erasure Coded files and holds the Erasure Coding Policy ID. Previously, the 'replication' field in the INodeFile on-disk format was overloaded to hold the Erasure Coding Policy ID.

    Description

      For Erasure Coded files, the replication field in the INodeFile message is re-used to store the EC Policy ID.

      FSDirWriteFileOp#addFile

        private static INodesInPath addFile(
            FSDirectory fsd, INodesInPath existing, byte[] localName,
            PermissionStatus permissions, short replication, long preferredBlockSize,
            String clientName, String clientMachine)
            throws IOException {
          // ...
          try {
            ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
                getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
            if (ecPolicy != null) {
              replication = ecPolicy.getId();   // <=== EC policy ID stored in the replication field
            }
            final BlockType blockType = ecPolicy != null?
                BlockType.STRIPED : BlockType.CONTIGUOUS;
            INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
                modTime, modTime, replication, preferredBlockSize, blockType);
            newNode.setLocalName(localName);
            newNode.toUnderConstruction(clientName, clientMachine);
            newiip = fsd.addINode(existing, newNode, permissions.getPermission());
      

      With the HDFS-11268 fix, FSImageFormatPBINode#Loader#loadInodeFile correctly reads the EC policy ID from the replication field and then uses the corresponding policy to construct the blocks.
      FSImageFormatPBINode#Loader#loadInodeFile

            ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
                ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
                null;
      

      The original intention was to re-use the replication field so that the in-memory representation would be compact. However, this isn't necessary for the on-disk representation: replication is an optional field, so adding another optional field for the EC policy won't take any extra space.

      Also, we need appropriate asserts in place to ensure that both fields aren't set for the same INodeFile.
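
      The mutual-exclusion check described above could look roughly like the following. This is a minimal sketch, not the code from the patch: INodeFileRecordSketch, its load method, and the rule that a contiguous file must carry a replication factor are all assumptions made for illustration, and the class is a plain Java stand-in rather than the real FSImage protobuf message.

```java
import java.util.OptionalInt;

/**
 * Hypothetical stand-in for the serialized INodeFile record.
 * This is NOT the real FSImage protobuf class; all names are illustrative.
 */
final class INodeFileRecordSketch {
    enum BlockType { CONTIGUOUS, STRIPED }

    final BlockType blockType;
    final OptionalInt replication;           // expected only for CONTIGUOUS files
    final OptionalInt erasureCodingPolicyID; // expected only for STRIPED files

    private INodeFileRecordSketch(BlockType blockType, OptionalInt replication,
                                  OptionalInt erasureCodingPolicyID) {
        this.blockType = blockType;
        this.replication = replication;
        this.erasureCodingPolicyID = erasureCodingPolicyID;
    }

    /**
     * Builds a record only if exactly one of the two optional fields is set
     * and it is the field that matches the block type.
     */
    static INodeFileRecordSketch load(BlockType blockType, OptionalInt replication,
                                      OptionalInt erasureCodingPolicyID) {
        if (replication.isPresent() && erasureCodingPolicyID.isPresent()) {
            throw new IllegalStateException(
                "replication and erasureCodingPolicyID are both set");
        }
        if (blockType == BlockType.STRIPED && !erasureCodingPolicyID.isPresent()) {
            throw new IllegalStateException("striped file without an EC policy ID");
        }
        if (blockType == BlockType.CONTIGUOUS && !replication.isPresent()) {
            throw new IllegalStateException("contiguous file without a replication factor");
        }
        return new INodeFileRecordSketch(blockType, replication, erasureCodingPolicyID);
    }

    public static void main(String[] args) {
        // A striped file carries only the EC policy ID.
        INodeFileRecordSketch striped =
            load(BlockType.STRIPED, OptionalInt.empty(), OptionalInt.of(1));
        System.out.println(striped.blockType + " ecPolicyID="
            + striped.erasureCodingPolicyID.getAsInt());

        // Setting both fields on the same record is rejected.
        try {
            load(BlockType.STRIPED, OptionalInt.of(3), OptionalInt.of(1));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      Keeping the two values in separate optional fields means the loader no longer has to reinterpret replication based on the block type; it only has to verify that the field matching the block type is the one that is present.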

      Attachments

        1. HDFS-11382.05.patch
          44 kB
          Manoj Govindassamy
        2. HDFS-11382.04.patch
          44 kB
          Manoj Govindassamy
        3. HDFS-11382.03.patch
          38 kB
          Manoj Govindassamy
        4. HDFS-11382.02.patch
          37 kB
          Manoj Govindassamy
        5. HDFS-11382.01.patch
          18 kB
          Manoj Govindassamy

            People

              Assignee: Manoj Govindassamy (manojg)
              Reporter: Manoj Govindassamy (manojg)
              Votes: 0
              Watchers: 7