Description
The invalidate(String bpid, Block[] invalidBlks) method in SimulatedFSDataset.java should remove all invalidBlks from the simulated file system. It currently fails to do that.
Attachments
Attachments
- HDFS-6799.patch
- 1 kB
- Megasthenis Asteris
Activity
Can you please make megas a contributor so that he can assign this to himself ?
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12659050/HDFS-6799.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7526//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7526//console
This message is automatically generated.
unfinalizeBlock in SimulatedFSDataset needs to be fixed as well.
unfinalizeBlock indeed needs to be fixed, but I am not sure what the expected behavior should be.
The way it is structured now, it seems that it would suffice to delete the block from the SimulatedFSDataset's map of blocks. However, note that this is not exactly the opposite of finilizeBlock as one might expect.
Also, I realized that TestSimulatedFSDataset also has bugs:
checkInvalidBlock(ExtendedBlock b) creates a new simulated dataset every time it is called. Clearly, b will not be in the new dataset and checkInvalidBlock practically always assumes that b is indeed an invalid block. Should I submit this as a separate bug, or fix it here?
The block removal issues in SimulatedFSDataset are related to the introduction of block pools.
Could you please fix the unfinalizeBlock so that block is removed from the map inside blockMap ?
Please open a new issue for the bugs in TestSimulatedFSDataset.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12668628/HDFS-6799.patch
against trunk revision 98588cf.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8017//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8017//console
This message is automatically generated.
+1. I intend to commit this on monday if there are no further comments.
SUCCESS: Integrated in Hadoop-Yarn-trunk #683 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/683/)
HDFS-6799. The invalidate method in SimulatedFSDataset failed to remove (invalidate) blocks from the file system. Contributed by Megasthenis Asteris. (benoy: rev 02adf7185de626492bba2b3718959457e958a7be)
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1899 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1899/)
HDFS-6799. The invalidate method in SimulatedFSDataset failed to remove (invalidate) blocks from the file system. Contributed by Megasthenis Asteris. (benoy: rev 02adf7185de626492bba2b3718959457e958a7be)
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
FAILURE: Integrated in Hadoop-Hdfs-trunk #1874 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1874/)
HDFS-6799. The invalidate method in SimulatedFSDataset failed to remove (invalidate) blocks from the file system. Contributed by Megasthenis Asteris. (benoy: rev 02adf7185de626492bba2b3718959457e958a7be)
- hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
TestBalancer.testUnknownDatanode failed in build #8162. It might be related to the change here since there were a lot of ReplicaNotFoundException resulted from SimulatedFSDataset. For example,
2014-09-23 11:20:05,974 ERROR datanode.DataNode (DataXceiver.java:run(243)) - host1.foo.com:47137:DataXceiver error processing COPY_BLOCK operation src: /127.0.0.1:36294 dst: /127.0.0.1:47137 org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1049218722-67.195.81.148-1411471192218:blk_1073741850_1026 at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:419) at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:918) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:241) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:80) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225) at java.lang.Thread.run(Thread.java:662)
Thanks Benoy. The patch here was correct. There might be more bugs in SimulatedFSDataset or TestBalancer.testUnknownDatanode. The fix here simply triggered them.
szetszwo, I am not able to simulate the error locally.
Running org.apache.hadoop.hdfs.server.balancer.TestBalancer
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 201.457 sec - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
I also went through other builds randomly, but doesn't see any more failures.
benoyantony, thanks for checking it. Let's ignore it for now and see if the test fail again in the future.
Good catch , megas.
Is there a way to add a unit test for this ?
arpitagarwal, szetszwo, could you please take a look ?