Details
- Improvement
- Status: Resolved
- Major
- Resolution: Duplicate
- 2.5.0
- None
- None
Description
Some maintenance work on a DataNode (e.g., upgrading RAM or adding disks) takes only a short amount of time (e.g., 10 minutes). In these cases, users do not want missing blocks to be reported for this DN, because the DN will be back online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on it without having to decommission it and without the DN being marked as dead.
Attachments
- HDFS-6729.000.patch
- 10 kB
- Lei (Eddy) Xu
- HDFS-6729.001.patch
- 10 kB
- Lei (Eddy) Xu
- HDFS-6729.002.patch
- 30 kB
- Lei (Eddy) Xu
- HDFS-6729.003.patch
- 12 kB
- Lei (Eddy) Xu
- HDFS-6729.004.patch
- 36 kB
- Lei (Eddy) Xu
- HDFS-6729.005.patch
- 39 kB
- Lei (Eddy) Xu
Issue Links
- duplicates HDFS-7877 [Umbrella] Support maintenance state for datanodes (Resolved)
Activity
Hi, aw. We have cases where users upgrade RAM and/or add disks, etc., which takes only a short amount of time (e.g., 10 minutes of downtime). It might not be necessary to report missing data on this DN if the users are certain that the DN will come back shortly.
Is this just to avoid the replication? If it is a 10 min downtime, the replication will have just started and stop just as fast. If it is longer than 10 mins, then you really are better off letting the system do its thing in case the node doesn't come back. There is also the problem of what to do about the NM, since it has the same problem...
By default it takes 10 and a half minutes until the NameNode starts re-replicating anything. With the stale DN feature turned on, applications trying to read from the stale node will be re-directed, so the cluster won't experience lag (or at least, not because of applications trying to contact the node under maintenance).
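The "10 and a half minutes" figure can be reproduced from two configuration defaults. The sketch below assumes the stock values of dfs.namenode.heartbeat.recheck-interval (5 minutes) and dfs.heartbeat.interval (3 seconds); the class and method names are made up for illustration and are not Hadoop code.

```java
/** Sketch: how the NameNode's default "dead DataNode" timeout comes to 10.5 minutes. */
final class HeartbeatExpiry {
    // Assumed defaults: dfs.namenode.heartbeat.recheck-interval (ms)
    // and dfs.heartbeat.interval (seconds).
    static final long DEFAULT_RECHECK_INTERVAL_MS = 5 * 60 * 1000L;
    static final long DEFAULT_HEARTBEAT_INTERVAL_SEC = 3L;

    /** A DataNode is declared dead after 2 * recheck + 10 * heartbeat without a heartbeat. */
    static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        long ms = expireIntervalMs(DEFAULT_RECHECK_INTERVAL_MS, DEFAULT_HEARTBEAT_INTERVAL_SEC);
        // Prints: 630000 ms = 10.5 minutes
        System.out.println(ms + " ms = " + ms / 60000.0 + " minutes");
    }
}
```

Any maintenance that finishes inside this window never triggers re-replication, which is the point being made above.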
So I guess the question is, is it worth adding another state in case the maintenance on the datanode can't be finished in 10 minutes? On the upside, I suppose it probably wouldn't be a lot of code. It would be very similar to the stale datanode stuff we already implemented.
aw and cmccabe Thanks for looking into this issue!
We have customers encountering significant lag time between each decommissioned node (e.g., pulling data away from each other node), as described by cmccabe. This significant lag time "blew users' maintenance window."
So, I am wondering whether it is possible to let users set a maintenance mode on a DN for a given time (e.g., the user specifies a maintenance time of 1 hour), after which, if the DN has not come back, the NN starts the normal re-replication process?
This patch adds support for marking a DataNode as being in maintenance mode, during which the system administrator can turn the DataNode off to upgrade it. An expiration time is set for maintenance mode; if the NameNode still has not received a heartbeat from this DataNode once maintenance mode expires, the NN considers the DataNode dead and the normal data recovery process kicks in to re-replicate its blocks.
CLI and configuration file support for this feature will be added in another JIRA.
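The expiration rule described here can be sketched as a single liveness check. This is only an illustration of the idea, not the code in the attached patches: the class, the field names and the convention that 0 means "not in maintenance" are all assumptions.

```java
/** Sketch of a liveness check that honors a maintenance window (hypothetical names). */
final class MaintenanceCheck {
    /** Hypothetical per-DataNode record kept by the NameNode. */
    static final class NodeInfo {
        long lastHeartbeatMs;
        long maintenanceExpiresAtMs; // assumption: 0 means "not in maintenance mode"
    }

    /**
     * A node is dead only if its heartbeat has expired AND it is not
     * protected by a maintenance window that is still open.
     */
    static boolean isDead(NodeInfo node, long nowMs, long heartbeatExpireIntervalMs) {
        boolean heartbeatExpired = nowMs - node.lastHeartbeatMs > heartbeatExpireIntervalMs;
        boolean inMaintenance = nowMs < node.maintenanceExpiresAtMs;
        return heartbeatExpired && !inMaintenance;
    }
}
```

With a one-hour window, a node that stops heartbeating is left alone until the window closes; if it is still silent after that, the ordinary dead-node handling applies.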
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12663545/HDFS-6729.000.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
-1 release audit. The applied patch generated 3 release audit warnings.
-1 core tests. The test build failed in hadoop-hdfs-project/hadoop-hdfs
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7714//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/7714//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7714//console
This message is automatically generated.
Added a test for the scenario in which the NameNode wakes up to check heartbeats in the background before the DataNode's maintenance mode expires.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12663686/HDFS-6729.001.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestEncryptionZones
org.apache.hadoop.hdfs.server.datanode.TestBPOfferService
org.apache.hadoop.security.TestRefreshUserMappings
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7725//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7725//console
This message is automatically generated.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12663844/HDFS-6729.002.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7741//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7741//console
This message is automatically generated.
In this patch, the user enables and disables maintenance mode for a DataNode with the CLI tool hdfs dfsadmin -maintainDatanode, rather than with a "maintenance file" analogous to the "exclude file" used for decommissioning, since maintenance mode should be temporary. This avoids the case where a system administrator runs dfsadmin -refreshNodes and accidentally extends the expiration time for the DataNode.
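The concern about -refreshNodes can be illustrated with a toy registry. This is a sketch under the assumption that a file-driven design would recompute the expiry on every refresh; the class and its methods are hypothetical and do not reflect the patch's actual classes or the arguments of the proposed command.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Toy registry contrasting file-driven and command-driven maintenance (hypothetical). */
final class MaintenanceRegistry {
    private final Map<String, Long> expiryByNode = new HashMap<>();

    /** File-driven variant: every refresh re-applies the duration, pushing the expiry out. */
    void refreshFromFile(Collection<String> nodesInFile, long nowMs, long durationMs) {
        for (String node : nodesInFile) {
            expiryByNode.put(node, nowMs + durationMs);
        }
    }

    /** Command-driven variant: the expiry is set once, when the admin asks for it. */
    void setMaintenance(String node, long nowMs, long durationMs) {
        expiryByNode.put(node, nowMs + durationMs);
    }

    /** Ends maintenance explicitly. */
    void clearMaintenance(String node) {
        expiryByNode.remove(node);
    }

    /** Returns the expiry time in ms, or 0 if the node is not in maintenance. */
    long expiryOf(String node) {
        return expiryByNode.getOrDefault(node, 0L);
    }
}
```

If an unrelated refresh runs 30 minutes into a one-hour window, the file-driven path silently moves the expiry to 90 minutes, whereas the command-driven path leaves it where the admin put it.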
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12670046/HDFS-6729.003.patch
against trunk revision 444acf8.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
org.apache.hadoop.hdfs.server.balancer.TestBalancer
org.apache.hadoop.hdfs.server.mover.TestStorageMover
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8113//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8113//console
This message is automatically generated.
Updated the patch to:
- add dfsadmin -setMaintenanceMode command and RPCs to NN
- change dfsadmin -report to display maintenance node information.
Hey Eddy, quick question, this looks like soft state that isn't persisted across NN restarts / failovers. Is that suitable for the target usecases?
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12694290/HDFS-6729.004.patch
against trunk revision 8f26d5a.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.cli.TestHDFSCLI
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9321//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9321//console
This message is automatically generated.
Updated the patch to fix the failed test.
Hey, andrew.wang, thanks for your quick review. Yes, this maintenance mode is a soft state. NN restarts / failovers are relatively rare events. Even in NN restart / failover scenarios, the NN can treat this DN as stale / dead, which does not sacrifice the durability / availability of the DN.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12694856/HDFS-6729.005.patch
against trunk revision 8bf6f0b.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. There were no new javadoc warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestLeaseRecovery2
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9350//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9350//console
This message is automatically generated.
Eddy, thanks for the work. We didn't know about this at all until Allen pointed it out in HDFS-7877. Sounds like we should combine the effort.
Maybe we can step back and discuss the design. There are a couple of key things we want to take care of. It would be great if you could check out the design there.
1. Admin interface. Based on our admins' input, it seems "dfsadmin -refreshNodes" might be easier to use.
2. DN state machine. We define two new states for maintenance, ENTERING_MAINTENANCE and IN_MAINTENANCE. This takes care of the case where there are no replicas on other datanodes. It also takes care of the different state transitions, such as from decommission states to maintenance states.
3. Block management. We also enforce the read and write operations when machines are in maintenance states.
Look forward to the collaboration.
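The two-state design in point 2 might look roughly like the sketch below. Only the state names ENTERING_MAINTENANCE and IN_MAINTENANCE come from the comment above; the other states, the transition rules and all method names are assumptions made for illustration, and the real design is in HDFS-7877.

```java
/** Sketch of a DataNode admin state machine with maintenance states (assumed rules). */
final class MaintenanceStates {
    enum AdminState {
        NORMAL, DECOMMISSION_INPROGRESS, DECOMMISSIONED,
        ENTERING_MAINTENANCE, IN_MAINTENANCE
    }

    /** Admin requests maintenance: the node first enters the transitional state. */
    static AdminState startMaintenance(AdminState current) {
        switch (current) {
            case NORMAL:
            case DECOMMISSION_INPROGRESS: // assumption: decommissioning nodes may switch over
                return AdminState.ENTERING_MAINTENANCE;
            default:
                return current; // assumption: other states are left unchanged
        }
    }

    /**
     * The node moves to IN_MAINTENANCE only once every block it holds has
     * a live replica elsewhere, so taking it offline cannot lose data.
     */
    static AdminState onReplicationCheck(AdminState current, boolean everyBlockHasOtherLiveReplica) {
        if (current == AdminState.ENTERING_MAINTENANCE && everyBlockHasOtherLiveReplica) {
            return AdminState.IN_MAINTENANCE;
        }
        return current;
    }

    /** Admin ends maintenance: the node returns to normal service. */
    static AdminState stopMaintenance(AdminState current) {
        if (current == AdminState.ENTERING_MAINTENANCE || current == AdminState.IN_MAINTENANCE) {
            return AdminState.NORMAL;
        }
        return current;
    }
}
```

The transitional state is what covers the "no replicas on other datanodes" case: a node holding the only copy of a block stays in ENTERING_MAINTENANCE until that block has been replicated.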
What types of operations are expected to be done while the DN is in maint mode?