Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version/s: 3.4.0
- Labels: None
Description
Presently, dfs.namenode.file.close.num-committed-allowed is ignored for EC blocks. Under heavy load, incremental block reports (IBRs) from DataNodes may be delayed, causing the file write to fail on close. EC files should therefore be allowed to close with blocks in the COMMITTED state, just as replicated files are.
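The setting under discussion is a NameNode property in hdfs-site.xml. A minimal sketch of enabling it follows; the value 1 is only an illustrative example, not a recommendation from this issue:

```xml
<!-- Illustrative example only: allows up to 1 block of a file to still be
     in the COMMITTED (not yet COMPLETE) state when the file is closed.
     With this change, the relaxation applies to erasure-coded files as
     well as replicated files. -->
<property>
  <name>dfs.namenode.file.close.num-committed-allowed</name>
  <value>1</value>
</property>
```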
Attachments
- HDFS-15359-01.patch (5 kB, Ayush Saxena)
- HDFS-15359-02.patch (3 kB, Ayush Saxena)
- HDFS-15359-03.patch (5 kB, Ayush Saxena)
- HDFS-15359-04.patch (7 kB, Ayush Saxena)
- HDFS-15359-05.patch (8 kB, Ayush Saxena)
Issue Links
- is related to:
  - HDFS-15315 IOException on close() when using Erasure Coding (Open)
  - HDFS-8999 Allow a file to be closed with COMMITTED but not yet COMPLETE blocks. (Resolved)
Activity
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 36s | Docker mode activated. |
Prechecks | |||
+1 | dupname | 0m 0s | No case conflicting files found. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
trunk Compile Tests | |||
-1 | mvninstall | 28m 47s | root in trunk failed. |
+1 | compile | 1m 7s | trunk passed |
+1 | checkstyle | 0m 48s | trunk passed |
+1 | mvnsite | 1m 15s | trunk passed |
-1 | shadedclient | 17m 48s | branch has errors when building and testing our client artifacts. |
+1 | javadoc | 0m 42s | trunk passed |
0 | spotbugs | 3m 7s | Used deprecated FindBugs config; considering switching to SpotBugs. |
+1 | findbugs | 3m 5s | trunk passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 8s | the patch passed |
+1 | compile | 1m 2s | the patch passed |
+1 | javac | 1m 2s | the patch passed |
+1 | checkstyle | 0m 40s | the patch passed |
+1 | mvnsite | 1m 9s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
-1 | shadedclient | 16m 4s | patch has errors when building and testing our client artifacts. |
+1 | javadoc | 0m 37s | the patch passed |
+1 | findbugs | 3m 5s | the patch passed |
Other Tests | |||
-1 | unit | 93m 40s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
172m 12s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | |
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
Subsystem | Report/Notes |
---|---|
Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29310/artifact/out/Dockerfile |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13003151/HDFS-15359-02.patch |
Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
uname | Linux 20534c52ea98 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | personality/hadoop.sh |
git revision | trunk / 178336f8a8b |
Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/29310/artifact/out/branch-mvninstall-root.txt |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/29310/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29310/testReport/ |
Max. process+thread count | 3530 (vs. ulimit of 5500) |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29310/console |
versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 28m 33s | Docker mode activated. |
Prechecks | |||
+1 | dupname | 0m 1s | No case conflicting files found. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
trunk Compile Tests | |||
-1 | mvninstall | 22m 1s | root in trunk failed. |
+1 | compile | 1m 8s | trunk passed |
+1 | checkstyle | 0m 48s | trunk passed |
+1 | mvnsite | 1m 14s | trunk passed |
-1 | shadedclient | 17m 47s | branch has errors when building and testing our client artifacts. |
+1 | javadoc | 0m 42s | trunk passed |
0 | spotbugs | 3m 6s | Used deprecated FindBugs config; considering switching to SpotBugs. |
+1 | findbugs | 3m 4s | trunk passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 8s | the patch passed |
+1 | compile | 1m 3s | the patch passed |
+1 | javac | 1m 3s | the patch passed |
+1 | checkstyle | 0m 44s | the patch passed |
+1 | mvnsite | 1m 9s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
-1 | shadedclient | 16m 2s | patch has errors when building and testing our client artifacts. |
+1 | javadoc | 0m 37s | the patch passed |
+1 | findbugs | 3m 6s | the patch passed |
Other Tests | |||
-1 | unit | 115m 4s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
214m 59s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
Subsystem | Report/Notes |
---|---|
Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29311/artifact/out/Dockerfile |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13003161/HDFS-15359-03.patch |
Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
uname | Linux 6c27f7ed5a70 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | personality/hadoop.sh |
git revision | trunk / 6e416a83d1e |
Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/29311/artifact/out/branch-mvninstall-root.txt |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/29311/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29311/testReport/ |
Max. process+thread count | 2838 (vs. ulimit of 5500) |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29311/console |
versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
I'm not familiar with this configuration, but I suppose szetszwo may have some comments.
What are the semantics of this configuration for EC blocks? Would you like to update hdfs-default.xml with a description covering the EC case?
Thanx weichiu
I have updated the patch, with details added to hdfs-default.xml as well.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 1m 12s | Docker mode activated. |
Prechecks | |||
+1 | dupname | 0m 0s | No case conflicting files found. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
trunk Compile Tests | |||
+1 | mvninstall | 20m 28s | trunk passed |
+1 | compile | 1m 12s | trunk passed |
+1 | checkstyle | 0m 49s | trunk passed |
+1 | mvnsite | 1m 21s | trunk passed |
+1 | shadedclient | 15m 41s | branch has no errors when building and testing our client artifacts. |
+1 | javadoc | 0m 43s | trunk passed |
0 | spotbugs | 3m 5s | Used deprecated FindBugs config; considering switching to SpotBugs. |
+1 | findbugs | 3m 2s | trunk passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 13s | the patch passed |
+1 | compile | 1m 7s | the patch passed |
+1 | javac | 1m 7s | the patch passed |
+1 | checkstyle | 0m 43s | the patch passed |
+1 | mvnsite | 1m 14s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | xml | 0m 1s | The patch has no ill-formed XML file. |
+1 | shadedclient | 13m 56s | patch has no errors when building and testing our client artifacts. |
+1 | javadoc | 0m 41s | the patch passed |
+1 | findbugs | 3m 3s | the patch passed |
Other Tests | |||
-1 | unit | 108m 42s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 52s | The patch does not generate ASF License warnings. |
176m 13s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | |
hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer | |
hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy | |
hadoop.hdfs.TestDecommission | |
hadoop.hdfs.TestFileChecksumCompositeCrc | |
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | |
hadoop.hdfs.server.namenode.ha.TestMultiObserverNode | |
hadoop.hdfs.TestEncryptedTransfer | |
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | |
hadoop.hdfs.TestReconstructStripedFile | |
hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | |
hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup | |
hadoop.hdfs.client.impl.TestBlockReaderLocal | |
hadoop.hdfs.client.impl.TestBlockReaderFactory |
Subsystem | Report/Notes |
---|---|
Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29325/artifact/out/Dockerfile |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13003344/HDFS-15359-04.patch |
Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
uname | Linux ad864f7114bc 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | personality/hadoop.sh |
git revision | trunk / bdbd59cfa09 |
Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/29325/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29325/testReport/ |
Max. process+thread count | 3784 (vs. ulimit of 5500) |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29325/console |
versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
The test failures are due to java.lang.OutOfMemoryError: unable to create new native thread, which seems unrelated.
What does the test check? It looks to me like the added test verifies two scenarios, but what is being checked in the first case? I was expecting an assertion somewhere, but maybe that's not needed?
Yes, the test verifies two scenarios:
First scenario: when the last block is in the COMMITTED state and dfs.namenode.file.close.num-committed-allowed is configured to more than 0, the file close should succeed. If dfs.namenode.file.close.num-committed-allowed is set to 0 and the fix is removed, the close fails with the same error as HDFS-15315.
Second scenario: if, in the same setup, the last block group isn't complete (one DN is down among the 3 DNs), the logic above shouldn't kick in and allow COMMITTED files to close. This prevents data loss in extreme cases.
Thanks ayushtkn for the patch.
I think the approach of allowing a committed block only when the write happened to all nodes is very reasonable to prevent unexpected data loss.
Two minor comments:
```java
if (b.isStriped()) {
  BlockInfoStriped blkStriped = (BlockInfoStriped) b;
  if (b.getUnderConstructionFeature().getExpectedStorageLocations().length
      != blkStriped.getRealTotalBlockNum()) {
    return b + " is a striped block in " + state + " with less then "
        + "required number of blocks.";
  }
}
```
Move this check after `if (state != BlockUCState.COMMITTED) ` check. It makes more sense there.
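The reordering suggested here can be sketched with simplified stand-in types (`BlockState`, `Block`, and `checkCommitted` below are illustrative stubs, not the actual Hadoop classes): the cheap state check runs first, and the striped storage-count check only runs for blocks already known to be COMMITTED.

```java
// Hypothetical, simplified sketch of the suggested check ordering.
public class CommittedBlockCheck {
  enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION }

  static final class Block {
    final BlockState state;
    final boolean striped;
    final int expectedStorages;   // storage locations the writer expected
    final int realTotalBlockNum;  // internal blocks the EC group requires
    Block(BlockState state, boolean striped, int expected, int real) {
      this.state = state;
      this.striped = striped;
      this.expectedStorages = expected;
      this.realTotalBlockNum = real;
    }
  }

  /** Returns an error string, or null if the block may count as committed. */
  static String checkCommitted(Block b) {
    // State check first: non-COMMITTED blocks are rejected immediately.
    if (b.state != BlockState.COMMITTED) {
      return b + " is not in the COMMITTED state";
    }
    // Striped check second, as the review suggests: a COMMITTED striped
    // block must have been written to all expected internal blocks.
    if (b.striped && b.expectedStorages != b.realTotalBlockNum) {
      return b + " is a striped COMMITTED block with fewer than the"
          + " required number of blocks";
    }
    return null;
  }
}
```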
In test,
```java
// Check if the blockgroup isn't complete then file close shouldn't be
// success with block in committed state.
cluster.getDataNodes().get(0).shutdown();
FSDataOutputStream str = dfs.create(new Path("/dir/file1"));
for (int i = 0; i < 1024 * 1024 * 4; i++) {
  str.write(i);
}
DataNodeTestUtils.pauseIBR(cluster.getDataNodes().get(0));
DataNodeTestUtils.pauseIBR(cluster.getDataNodes().get(1));
LambdaTestUtils.intercept(IOException.class, "", () -> str.close());
```
You should `pauseIBR` datanodes 1 and 2; datanode 0 is already shut down.
+1 once addressed.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 2m 41s | Docker mode activated. |
Prechecks | |||
+1 | dupname | 0m 0s | No case conflicting files found. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
trunk Compile Tests | |||
+1 | mvninstall | 25m 59s | trunk passed |
+1 | compile | 1m 31s | trunk passed |
+1 | checkstyle | 1m 1s | trunk passed |
+1 | mvnsite | 1m 36s | trunk passed |
+1 | shadedclient | 21m 14s | branch has no errors when building and testing our client artifacts. |
+1 | javadoc | 0m 56s | trunk passed |
0 | spotbugs | 3m 54s | Used deprecated FindBugs config; considering switching to SpotBugs. |
+1 | findbugs | 3m 53s | trunk passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 26s | the patch passed |
+1 | compile | 1m 22s | the patch passed |
+1 | javac | 1m 22s | the patch passed |
+1 | checkstyle | 0m 55s | the patch passed |
+1 | mvnsite | 1m 31s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | xml | 0m 2s | The patch has no ill-formed XML file. |
+1 | shadedclient | 17m 1s | patch has no errors when building and testing our client artifacts. |
+1 | javadoc | 0m 39s | the patch passed |
+1 | findbugs | 3m 4s | the patch passed |
Other Tests | |||
-1 | unit | 106m 48s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
191m 49s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
hadoop.hdfs.TestReconstructStripedFile | |
hadoop.hdfs.TestStripedFileAppend | |
hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
Subsystem | Report/Notes |
---|---|
Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29387/artifact/out/Dockerfile |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13004440/HDFS-15359-05.patch |
Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
uname | Linux 3b613e349bfe 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | personality/hadoop.sh |
git revision | trunk / 19f26a020e2 |
Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/29387/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29387/testReport/ |
Max. process+thread count | 2815 (vs. ulimit of 5500) |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29387/console |
versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
The test failures are not related: the Xmits ones failed in other builds too, and TestStripedFileAppend passed locally.
Thanx vinayakumarb for the review.
weichiu I will be holding this for a day. Let me know if you intend to take a look but are busy, and I will hold off committing further.
Committed to trunk.
Thanx vinayakumarb and weichiu for the reviews!!!
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18331 (See https://builds.apache.org/job/Hadoop-trunk-Commit/18331/)
HDFS-15359. EC: Allow closing a file with committed blocks. Contributed (ayushsaxena: rev 2326123705445dee534ac2c298038831b5d04a0a)
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
This message was automatically generated.