HDFS-11382

Persist Erasure Coding Policy ID in a new optional field in INodeFile in FSImage

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 3.0.0-alpha4
    • Component/s: hdfs
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      The FSImage on-disk format for INodeFile is changed to additionally include a field for Erasure Coded files. This optional field 'erasureCodingPolicyID', of type uint32, is set for all Erasure Coded files and represents the Erasure Coding Policy ID. Previously, the 'replication' field in the INodeFile disk format was overloaded to represent the same Erasure Coding Policy ID.

      Description

      For Erasure Coded files, the replication field in the INodeFile message is re-used to store the EC Policy ID.

      FSDirWriteFileOp#addFile

        private static INodesInPath addFile(
            FSDirectory fsd, INodesInPath existing, byte[] localName,
            PermissionStatus permissions, short replication, long preferredBlockSize,
            String clientName, String clientMachine)
            throws IOException {
          .. .. ..
          try {
            ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp.
                getErasureCodingPolicy(fsd.getFSNamesystem(), existing);
            if (ecPolicy != null) {
          replication = ecPolicy.getId();   // <=== EC policy ID overwrites replication
            }
            final BlockType blockType = ecPolicy != null?
                BlockType.STRIPED : BlockType.CONTIGUOUS;
            INodeFile newNode = newINodeFile(fsd.allocateNewInodeId(), permissions,
                modTime, modTime, replication, preferredBlockSize, blockType);
            newNode.setLocalName(localName);
            newNode.toUnderConstruction(clientName, clientMachine);
            newiip = fsd.addINode(existing, newNode, permissions.getPermission());
      

      With the HDFS-11268 fix, FSImageFormatPBINode#Loader#loadInodeFile correctly reads the EC policy ID from the replication field and then uses the right policy to construct the blocks.
      FSImageFormatPBINode#Loader#loadInodeFile

            ErasureCodingPolicy ecPolicy = (blockType == BlockType.STRIPED) ?
                ErasureCodingPolicyManager.getPolicyByPolicyID((byte) replication) :
                null;
      

      The original intention was to re-use the replication field so the in-memory representation would be compact. But this isn't necessary for the on-disk representation: replication is an optional field, and adding another optional field for the EC policy won't take any extra space.

      Also, we need to make sure to have the appropriate asserts in place so that both fields aren't set for the same INodeFile.
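A minimal sketch of such a check (the class and method names here are illustrative, not the actual INodeFile API): exactly one of the replication factor and the EC policy ID should be present when constructing a file inode.

```java
// Hypothetical helper: a contiguous file carries a replication factor,
// a striped file carries an erasure coding policy ID -- never both.
public class EcFieldCheck {
    /** Throws if both or neither of replication / ecPolicyId are set. */
    public static void checkExclusive(Short replication, Byte ecPolicyId) {
        if ((replication == null) == (ecPolicyId == null)) {
            throw new IllegalArgumentException(
                "Exactly one of replication and erasureCodingPolicyID must be set");
        }
    }
}
```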

      1. HDFS-11382.01.patch
        18 kB
        Manoj Govindassamy
      2. HDFS-11382.02.patch
        37 kB
        Manoj Govindassamy
      3. HDFS-11382.03.patch
        38 kB
        Manoj Govindassamy
      4. HDFS-11382.04.patch
        44 kB
        Manoj Govindassamy
      5. HDFS-11382.05.patch
        44 kB
        Manoj Govindassamy

        Issue Links

          Activity

          manojg Manoj Govindassamy added a comment -

          Attached v01 patch to address the following:
          1. New field erasureCodingPolicyID for fsimage.proto INodeFile.
          2. FSImageFormatPBINode Loader/Saver now reads/writes the newly defined on-disk field erasureCodingPolicyID in INodeFile.
          3. INodeFile#getReplication() returns 0 for Striped files.
          4. TestErasureCodingPolicies, TestFSImage, TestOfflineImageViewer, TestOfflineImageViewerWithStripedBlocks include tests verifying the replication factor of striped files, saving and loading of an FSImage containing striped files, and verifying its erasure coding policy ID.

          Andrew Wang, can you please take a look at the patch ?

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 13m 54s trunk passed
          +1 compile 0m 52s trunk passed
          +1 checkstyle 0m 30s trunk passed
          +1 mvnsite 0m 58s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 56s trunk passed
          +1 javadoc 0m 42s trunk passed
          +1 mvninstall 0m 52s the patch passed
          +1 compile 0m 52s the patch passed
          +1 cc 0m 52s the patch passed
          +1 javac 0m 52s the patch passed
          +1 checkstyle 0m 29s the patch passed
          +1 mvnsite 0m 58s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 1s the patch passed
          +1 javadoc 0m 39s the patch passed
          -1 unit 84m 10s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          111m 20s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestEncryptionZones
            hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
            hadoop.hdfs.TestReadStripedFileWithMissingBlocks
            hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
            hadoop.hdfs.tools.TestDFSAdminWithHA



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11382
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12850868/HDFS-11382.01.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux 94fdf2dac27e 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 0914fcc
          Default Java 1.8.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/18325/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18325/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18325/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Test failures don't look related to the patch. All failed tests pass locally for me with the patch.

          andrew.wang Andrew Wang added a comment -

          Thanks for working on this Manoj! Overall looks good, some review comments:

          • Noticed we now have a BlockTypeProto as well as an erasureCodingPolicyID in the INodeFile PB. The history here is that we used to have an isStriped boolean, which turned into the BlockTypeProto enum in HDFS-10759. Now that we have this optional field, it seems like BlockTypeProto is unnecessary. Ewan Higgs, any thoughts on this?
          • Related, I'm wondering if we should make isStriped just be return getBlockType() == STRIPED, looks equivalent to me.
          • FSImageFormatPBINode, it'd be better to check if the policy is present with f.hasErasureCodingPolicyID rather than getting based on if the block type is striped. This way we don't accidentally get the default value of 0.
          • INodeFile, I think having the combined replicationOrEcPolicyId parameter is confusing from the caller's perspective. It'd be better to have separate parameters that pass null or -1 if not set, and combine as appropriate in INodeFile.HeaderFormat#toLong (with requisite Precondition checks). This relates to the comment in toLong about this being hacky, I dug up this comment from Zhe during the initial review:

          The biggest concern is the INodeFile constructor – related to that, the toLong method. Currently when isStriped, we just interpret replication as the EC policy ID. This looks pretty hacky. But it looks pretty tricky to fix. How about adding a TODO here and filing a follow-on? I think what we need to do is to add some util methods to convert both 0 + replicationFactor and 1 + ecPolicy into a short typed variable, like blockLayoutRedundancy, and pass it to the constructor.

          Seems like this JIRA is a good time to fix the hack.

          • FSImageFormatPBInode#buildINodeFile, it looks like we unconditionally set the EC policy ID. Shouldn't we only set when there is a policy set, similar to the AclFeature and XAttrFeature? A unit test for this would be good.
          • INodeFile#getFileReplication, IIRC S3 returns a replication factor of 1. Could you check this and use the same setting for our striped files? IMO 1 is the most reasonable value here, since each data block has a repl factor of one. The javadoc also needs to be updated. We should also check for any callers that assume this is the EC policy ID, if you haven't yet.
          • Is the PBXMLWriter change intended for this JIRA?
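The presence-check point above can be illustrated with a toy stand-in for the protobuf-generated message (a hypothetical class for demonstration; the real code would call f.hasErasureCodingPolicyID() on the generated INodeFile): a proto2-style optional uint32 that was never set still reads back as its default value 0, so inferring presence from the block type can silently hand callers a bogus policy ID.

```java
// Toy model of a proto2-style optional field. Callers must test presence
// explicitly with hasX() rather than trust the getter's default value.
public class INodeFileStub {
    private final Integer ecPolicyId;   // null means "field not set"

    public INodeFileStub(Integer ecPolicyId) { this.ecPolicyId = ecPolicyId; }

    public boolean hasErasureCodingPolicyID() { return ecPolicyId != null; }

    public int getErasureCodingPolicyID() {
        return ecPolicyId == null ? 0 : ecPolicyId;  // proto default is 0
    }
}
```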
          manojg Manoj Govindassamy added a comment -

          Thanks for the review Andrew Wang. Attached v02 patch which addresses the following.

          I'm wondering if we should make isStriped just be return getBlockType() == STRIPED, looks equivalent to me.

          Done.

          FSImageFormatPBINode, it'd be better to check if the policy is present with f.hasErasureCodingPolicyID..

          Makes sense. Fixed the same way.

          This relates to the comment in toLong about this being hacky, I dug up this comment from Zhe during the initial review..Seems like this JIRA is a good time to fix the hack

          Sure. Fixed the constructor in INodeFile to have distinct params for replication and ecPolicyID, fixed all the callers. Made necessary checks for the validity of the argument passed in based on the BlockType context.

          FSImageFormatPBInode#buildINodeFile, it looks like we unconditionally set the EC policy ID.

          Fixed this. Now, ec policy id is set only if the file is striped.

          INodeFile#getFileReplication, IIRC S3 returns a replication factor of 1.

          Checked the S3 filesystem and it does return 1 for replication. Followed the same model and made INodeFile#getReplication*() return 1.

          Is the PBXMLWriter change intended for this JIRA?

          I added a few EC files in the setup in TestOfflineImageViewer so that I can test this JIRA's fix of the persisted EC policy ID. After adding these EC files to the FSImage, one of the XML tests in TestOfflineImageViewer started failing, complaining about an unexpected children section for the element <blocktype>. Followed similar handling done for storage policy in dumpingINode for blockType, and it solved the problem. So, my changes exposed a pre-existing problem in the FSImage XML processor, and it's now fixed.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 22s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 6 new or modified test files.
          0 mvndep 0m 27s Maven dependency ordering for branch
          +1 mvninstall 16m 16s trunk passed
          +1 compile 2m 2s trunk passed
          +1 checkstyle 0m 46s trunk passed
          +1 mvnsite 1m 57s trunk passed
          +1 mvneclipse 0m 30s trunk passed
          +1 findbugs 3m 52s trunk passed
          +1 javadoc 1m 4s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 28s the patch passed
          +1 compile 1m 31s the patch passed
          +1 cc 1m 31s the patch passed
          +1 javac 1m 31s the patch passed
          +1 checkstyle 0m 41s the patch passed
          +1 mvnsite 1m 28s the patch passed
          +1 mvneclipse 0m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          -1 findbugs 2m 6s hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
          +1 javadoc 1m 2s the patch passed
          +1 unit 1m 4s hadoop-hdfs-client in the patch passed.
          +1 unit 69m 43s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 26s The patch does not generate ASF License warnings.
          110m 39s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            Bitwise OR of signed byte value computed in org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, Short, Byte) At INodeFile.java:value computed in org.apache.hadoop.hdfs.server.namenode.INodeFile$HeaderFormat.getBlockLayoutRedundancy(BlockType, Short, Byte) At INodeFile.java:[line 199]



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11382
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12854132/HDFS-11382.02.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux 8ea709ca76f0 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 13d4bcf
          Default Java 1.8.0_121
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/18422/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18422/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18422/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          The above findbugs issue is not from new code. I modified the method signature, but the bitwise operations are from the existing code.
          The complaint is about "Bitwise OR of signed byte value". This can be an issue if the byte value being OR'd has its MSB set. But the current EC policy ID max value is only 4, so the above findbugs issue doesn't arise in practice. Let me know if we still need to fix this.

          andrew.wang Andrew Wang added a comment -

          Hi Manoj, thanks for the rev! Looks like my last wave of comments were mostly addressed, a few follow-ups:

          • FSDirWriteFileOp#addFileForEditLog and #addFile, would prefer that we set up the parameters for newINodeFile outside of the call, like how we set up blockType. This is a little cleaner.
          • Ewan Higgs hasn't responded, but I'd like to remove blockType from the INodeFile PB definition since it looks extraneous to me, and is extra overhead. BlockType can remain as an in-memory abstraction.
          • When reading and writing the fsimage, shouldn't we set replication XOR EC policy? This is in FSImageFormatPBINode. I'd also prefer we set up the parameters outside of the call, similar to the above review comments.
          • FSImageFormatPBSnapshot, if we use the boxed Short and Byte, we can avoid the ternary checks right? Paranoia precondition checks are never a bad idea if you want to add some.
          • INodeFile#DEFAULT_REPL_FOR_STRIPED_BLOCKS could use a javadoc comment, we can also explain why we chose "1" for the value
          • It looks like we use getSysPoliciesMinID and MaxID to validate that the policy ID is valid; could we just check that getPolicyByPolicyID isn't null instead? I get that this is more efficient, but it's cleaner to use the helper function instead, and we can optimize the helper to use a hashtable later if this becomes a bottleneck.
          • The new INodeFile header helpers look good, except it seems like we have some boilerplate now that callers need to do around blockType. I think what Zhe wanted was for callers to use a function like your getBlockLayoutRedundancy directly, which would in turn call into the more specific getContiguous and getStriped that would be the two halves of the blockType == STRIPED if statement currently in getBlockLayoutRedundancy. Hopefully that makes sense, I can provide a code snippet if it's unclear.

          For the findbugs issue, we should add this to the hdfs findbugsExcludeFile.xml, and possibly a code comment near the signed OR for the future.
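A rough sketch of the header-packing shape being discussed (the bit positions, widths, and helper names here are assumptions for illustration, not the actual INodeFile.HeaderFormat layout): getBlockLayoutRedundancy dispatches on block type to two specific helpers, so each caller only ever supplies the field that applies to its block type.

```java
// Hypothetical layout: one flag bit marks a striped file; the low bits
// hold either the replication factor (contiguous) or the EC policy ID
// (striped).
public class BlockLayoutSketch {
    static final int STRIPED_FLAG = 1 << 11;   // assumed flag position

    enum BlockType { CONTIGUOUS, STRIPED }

    public static short getBlockLayoutRedundancy(BlockType type,
                                                 Short replication,
                                                 Byte ecPolicyId) {
        return type == BlockType.STRIPED
            ? getStriped(ecPolicyId) : getContiguous(replication);
    }

    static short getContiguous(Short replication) {
        if (replication == null || replication < 0) {
            throw new IllegalArgumentException("replication required");
        }
        return replication;                     // striped flag stays clear
    }

    static short getStriped(Byte ecPolicyId) {
        if (ecPolicyId == null) {
            throw new IllegalArgumentException("EC policy ID required");
        }
        // mask to avoid sign-extending the byte (the findbugs warning)
        return (short) (STRIPED_FLAG | (ecPolicyId & 0xFF));
    }
}
```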

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Andrew Wang. Attached v03 patch addressing the following. Please take a look at the patch.

          would prefer that we set up the parameters for newINodeFile outside of the call, like how we set up blockType.

          Done.

          FSImageFormatPBINode...When reading and writing the fsimage, shouldn't we set replication XOR ec policy?

          Sure, set either replication or ec policy. As a side effect of this, I enhanced FSImageLoader to look for ecPolicy before blindly taking in replication. Needed for the OIV webhdfs reader.

          INodeFile#DEFAULT_REPL_FOR_STRIPED_BLOCKS could use a javadoc comment,

          Done.

          could we just check that getPolicyByPolicyID isn't null instead?

          Done.

          what Zhe wanted was for callers to use a function like your getBlockLayoutRedundancy directly, which would in turn call into the more specific ..

          Done.

          For the findbugs issue, we should add this to the hdfs findbugsExcludeFile.xml, and possibly a code comment near the signed OR

          Done.

          but I'd like to remove blockType from the INodeFile PB definition since it looks extraneous to me

          Can I take this up in a separate patch? It would be a more focused patch that way.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 6 new or modified test files.
          0 mvndep 0m 29s Maven dependency ordering for branch
          +1 mvninstall 14m 47s trunk passed
          +1 compile 1m 52s trunk passed
          +1 checkstyle 0m 47s trunk passed
          +1 mvnsite 1m 58s trunk passed
          +1 mvneclipse 0m 30s trunk passed
          +1 findbugs 4m 9s trunk passed
          +1 javadoc 1m 15s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 33s the patch passed
          +1 compile 1m 57s the patch passed
          +1 cc 1m 57s the patch passed
          +1 javac 1m 57s the patch passed
          +1 checkstyle 0m 39s the patch passed
          +1 mvnsite 1m 50s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 4m 28s the patch passed
          +1 javadoc 1m 10s the patch passed
          +1 unit 1m 17s hadoop-hdfs-client in the patch passed.
          -1 unit 79m 3s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 39s The patch does not generate ASF License warnings.
          121m 13s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
          Timed out junit tests org.apache.hadoop.hdfs.TestLeaseRecovery2
            org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
            org.apache.hadoop.hdfs.server.mover.TestStorageMover
            org.apache.hadoop.hdfs.server.mover.TestMover
            org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11382
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12854642/HDFS-11382.03.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml cc
          uname Linux 0467b2c5264c 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4a58870
          Default Java 1.8.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/18444/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18444/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18444/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          ehiggs Ewan Higgs added a comment -

          Andrew Wang, the main design concern I have is whether it plays nicely with HDFS-10867 (wherein BlockTypeProto is extended with a PROVIDED type). I think the patch currently works with HDFS-10867, as it's only adding a field after BlockTypeProto. However, your suggestion of removing BlockTypeProto brings us back to where we were with isStriped, except it would only be the incidental existence of a field that denotes the block type.

          You could remove blockType, but it will likely reappear in the changeset for HDFS-10867.

          ehiggs Ewan Higgs added a comment -

          BlockTypeProto is planned to be extended here.

          ehiggs Ewan Higgs added a comment -

          Is it possible to add more tests, like testSaveAndLoadStripedINodeFile and/or createStripedINodeFile, which create an INodeFile with bad arguments and verify that it throws?

          E.g. both replication != null and a set erasureCodingPolicyID.
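The kind of negative test being requested could be sketched as below. This is a standalone stand-in, assuming an XOR-style precondition between replication and EC policy ID; the class and method names are hypothetical, not the actual INodeFile API.

```java
// Hypothetical stand-in for the INodeFile constructor preconditions:
// a file may carry a replication factor XOR an erasure coding policy ID.
public class RedundancyCheck {
    static void validate(Short replication, Byte ecPolicyId) {
        if (replication != null && ecPolicyId != null) {
            throw new IllegalArgumentException(
                "replication and erasureCodingPolicyID are mutually exclusive");
        }
        if (replication == null && ecPolicyId == null) {
            throw new IllegalArgumentException(
                "either replication or erasureCodingPolicyID must be set");
        }
    }

    public static void main(String[] args) {
        validate((short) 3, null);          // ok: replicated file
        validate(null, (byte) 1);           // ok: striped file
        try {
            validate((short) 3, (byte) 1);  // bad: both set, must throw
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```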

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Ewan Higgs. Attached v04 patch, which addresses the following:

          • Added testStripedLayoutRedundancy in TestStripedINodeFile to test INodeFile construction for STRIPED layout redundancy, with error cases.
          • Added testContiguousLayoutRedundancy in TestINodeFile to test CONTIGUOUS layout redundancy INodeFile construction, with all error cases.
          andrew.wang Andrew Wang added a comment -

          Thanks for the rev Manoj, LGTM overall, +1 pending these little nits:

          • FSDirWriteFileOp: typo "replictaionFactor" in addFileForEditLog
          • Unused import for Preconditions in INodeFileAttributes

          Ewan Higgs, my concern is that encoding whether a file is erasure coded in both the EC policy and the BlockTypeProto fields opens us up to possible incongruity between the two fields. Since I'm not proposing we do away with BlockType entirely, I double checked the Precondition checks we have in this patch, and it looks okay.

          Also as an FYI, HDFS-8030 wants to implement "contiguous EC," so we need a JIRA to rename CONTIGUOUS to REPLICATED. I filed HDFS-11465 for this if you want to pick it up, should be pretty easy to do this refactoring with IDE assistance.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Andrew Wang. Attached v05 patch with typo and unused import fixed.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 18m 53s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 6 new or modified test files.
          0 mvndep 0m 28s Maven dependency ordering for branch
          +1 mvninstall 15m 4s trunk passed
          +1 compile 1m 48s trunk passed
          +1 checkstyle 0m 44s trunk passed
          +1 mvnsite 1m 56s trunk passed
          +1 mvneclipse 0m 28s trunk passed
          +1 findbugs 4m 2s trunk passed
          +1 javadoc 1m 15s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 51s the patch passed
          +1 compile 1m 56s the patch passed
          +1 cc 1m 56s the patch passed
          +1 javac 1m 56s the patch passed
          +1 checkstyle 0m 45s the patch passed
          +1 mvnsite 1m 54s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 4m 7s the patch passed
          +1 javadoc 1m 2s the patch passed
          +1 unit 1m 2s hadoop-hdfs-client in the patch passed.
          -1 unit 74m 33s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 25s The patch does not generate ASF License warnings.
          134m 42s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11382
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12854999/HDFS-11382.04.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml cc
          uname Linux 310b829f8d23 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5f5b031
          Default Java 1.8.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/18459/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18459/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18459/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Unit test failures in TestDataNodeVolumeFailure* are not related to the patch. These tests have gone flaky recently, but they pass for me locally.

          andrew.wang Andrew Wang added a comment -

          Committed to trunk, thanks for the contribution Manoj! Do you mind also adding a release note for this change, as it's incompatible?

          manojg Manoj Govindassamy added a comment -

          Thanks for the review and commit help Andrew Wang, Ewan Higgs. Updated Release Notes for this incompatible change.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11314 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11314/)
          HDFS-11382. Persist Erasure Coding Policy ID in a new optional field in (wang: rev 55c07bbed2f475f7b584a86112ee1b6fe0221e98)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/dev-support/findbugsExcludeFile.xml
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatPBINode.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/proto/fsimage.proto
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FSImageFormatPBSnapshot.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodingPolicies.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStripedINodeFile.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsFileStatus.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerWithStripedBlocks.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileAttributes.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageXmlWriter.java
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 4m 26s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 6 new or modified test files.
          0 mvndep 0m 28s Maven dependency ordering for branch
          +1 mvninstall 13m 2s trunk passed
          +1 compile 1m 25s trunk passed
          +1 checkstyle 0m 40s trunk passed
          +1 mvnsite 1m 26s trunk passed
          +1 mvneclipse 0m 26s trunk passed
          +1 findbugs 3m 12s trunk passed
          +1 javadoc 0m 58s trunk passed
          0 mvndep 0m 6s Maven dependency ordering for patch
          +1 mvninstall 1m 18s the patch passed
          +1 compile 1m 20s the patch passed
          +1 cc 1m 20s the patch passed
          +1 javac 1m 20s the patch passed
          +1 checkstyle 0m 37s the patch passed
          +1 mvnsite 1m 19s the patch passed
          +1 mvneclipse 0m 21s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 3s The patch has no ill-formed XML file.
          +1 findbugs 3m 21s the patch passed
          +1 javadoc 0m 55s the patch passed
          +1 unit 0m 54s hadoop-hdfs-client in the patch passed.
          +1 unit 63m 48s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          101m 52s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11382
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12855021/HDFS-11382.05.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml cc
          uname Linux 748cd2698e25 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5f5b031
          Default Java 1.8.0_121
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18462/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18462/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          ehiggs Ewan Higgs added a comment -

          Ewan Higgs, my concern is that encoding whether a file is erasure coded in both the EC policy and the BlockTypeProto fields opens us up to possible incongruity between the two fields. Since I'm not proposing we do away with BlockType entirely, I double checked the Precondition checks we have in this patch, and it looks okay.

          protobuf doesn't have sum types, so the best we can do is tag a message with a type and group the fields that should be distinct for each type.

          So what we have now is:

            message INodeFile {                                                           
              optional uint32 replication = 1;                                            
              optional uint64 modificationTime = 2;                                       
              optional uint64 accessTime = 3;                                             
              optional uint64 preferredBlockSize = 4;                                     
              optional fixed64 permission = 5;                                            
              repeated BlockProto blocks = 6;                                             
              optional FileUnderConstructionFeature fileUC = 7;                           
              optional AclFeatureProto acl = 8;                                           
              optional XAttrFeatureProto xAttrs = 9;                                      
              optional uint32 storagePolicyID = 10;                                       
              optional BlockTypeProto blockType = 11;               
              optional int32 erasureCodingPolicyID = 12;                      
            }   
          

          What we might want is:

            message INodeFile {      
              message StripedFields { 
                optional int32 erasureCodingPolicyID = 1;
              }
              message  ProvidedFields { 
                optional string someField = 1;
              }
                                                               
              optional uint32 replication = 1;                                            
              optional uint64 modificationTime = 2;                                       
              optional uint64 accessTime = 3;                                             
              optional uint64 preferredBlockSize = 4;                                     
              optional fixed64 permission = 5;                                            
              repeated BlockProto blocks = 6;                                             
              optional FileUnderConstructionFeature fileUC = 7;                           
              optional AclFeatureProto acl = 8;                                           
              optional XAttrFeatureProto xAttrs = 9;                                      
              optional uint32 storagePolicyID = 10;                                       
              optional BlockTypeProto blockType = 11;   
              optional StripedFields stripedFields = 12;
    optional ProvidedFields providedFields = 13;
  }
          

          This approach is how directories, files, and symlinks are encoded as INodes:

            message INode {                                                               
              enum Type {                                                                 
                FILE = 1;                                                                 
                DIRECTORY = 2;                                                            
                SYMLINK = 3;                                                              
              };                                                                          
              required Type type = 1;                                                     
              required uint64 id = 2;                                                     
              optional bytes name = 3;                                                    
                                                                                          
              optional INodeFile file = 4;                                                
              optional INodeDirectory directory = 5;                                      
              optional INodeSymlink symlink = 6;                                          
            }
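One practical consequence of this tag-plus-optional-payloads encoding is the incongruity concern raised above: protobuf itself won't enforce that the submessage matching the tag is the one actually set, so readers have to check. A minimal sketch of such a check, with illustrative plain-Java stand-ins rather than the generated protobuf classes or the HDFS loader's API:

```java
// Illustrative tagged-union congruence check, mirroring the INode proto
// pattern where an enum tag says which optional payload should be present.
public class TaggedInode {
    enum Type { FILE, DIRECTORY, SYMLINK }

    final Type type;
    final Object file, directory, symlink; // stand-ins for the submessages

    TaggedInode(Type type, Object file, Object directory, Object symlink) {
        this.type = type;
        this.file = file;
        this.directory = directory;
        this.symlink = symlink;
    }

    // True only if exactly the payload named by the tag is set.
    boolean isCongruent() {
        switch (type) {
            case FILE:      return file != null && directory == null && symlink == null;
            case DIRECTORY: return directory != null && file == null && symlink == null;
            case SYMLINK:   return symlink != null && file == null && directory == null;
            default:        return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(new TaggedInode(Type.FILE, new Object(), null, null).isCongruent());
        System.out.println(new TaggedInode(Type.FILE, null, new Object(), null).isCongruent());
    }
}
```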
          

            People

            • Assignee:
              manojg Manoj Govindassamy
              Reporter:
              manojg Manoj Govindassamy