Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 2.7.0
- None
- None
- None
Description
When a DataNode is decommissioned from the Hadoop cluster, HDFS reports the correct number of blocks to be replicated in the web UI.
Decomissioning
Columns: Node, Last contact, Under replicated blocks, Blocks with no live replicas, Under Replicated Blocks In files under construction
TS-BHTEST-03:50010 (172.22.49.3:50010) 25641 0 0
According to the NameNode log, however, block replication cannot be carried out because the expected storage type does not match.
Node /default/rack_02/172.22.49.5:50010 [
Storage [DISK]DS-3915533b-4ae4-4806-bf83caf1446f1e2f:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-3e54c331-3eaf-4447-b5e4-9bf91bc71b17:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-d44fa611-aa73-4415-a2de-7e73c9c5ea68:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-cebbf410-06a0-4171-a9bd-d0db55dad6d3:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-4c50b1c7-eaad-4858-b476-99dec17d68b5:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-f6cf9123-4125-4234-8e21-34b12170e576:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-7601b634-1761-45cc-9ffd-73ee8687c2a7:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-1d4b91ab-fe2f-4d5f-bd0a-57e9a0714654:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-cd2279cf-9c5a-4380-8c41-7681fa688eaf:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-630c734f-334a-466d-9649-4818d6e91181:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
Storage [DISK]DS-31cd0d68-5f7c-4a0a-91e6-afa53c4df820:NORMAL:172.22.49.5:50010 is not chosen since storage types do not match, where the required storage type is ARCHIVE.
]
2015-07-07 16:00:22,032 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK, ARCHIVE], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2015-07-07 16:00:22,032 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
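To make the log easier to follow, here is a minimal sketch of how a HOT-style policy (storageTypes=[DISK], replicationFallbacks=[ARCHIVE]) might end up first requiring ARCHIVE and then selecting nothing. It is a simplified model written for illustration only, not Hadoop's actual BlockStoragePolicy code: the class StorageFallbackSketch, the method choose, and its parameters are all hypothetical, and the value of 2 existing replicas is an assumption read off "still in need of 1 to reach 3".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of storage-type selection with fallbacks.
// This is NOT Hadoop's BlockStoragePolicy code; all names are illustrative.
class StorageFallbackSketch {
    enum StorageType { DISK, ARCHIVE }

    // Returns the storage types to request for the replicas that are still missing.
    // A preferred type that is marked unavailable is replaced by the first
    // available fallback, or dropped when no fallback is available.
    static List<StorageType> choose(int replication,
                                    int existingReplicas,
                                    Set<StorageType> unavailable,
                                    StorageType preferred,
                                    List<StorageType> fallbacks) {
        List<StorageType> selected = new ArrayList<>();
        int missing = replication - existingReplicas;
        for (int i = 0; i < missing; i++) {
            if (!unavailable.contains(preferred)) {
                selected.add(preferred);
                continue;
            }
            for (StorageType fallback : fallbacks) {
                if (!unavailable.contains(fallback)) {
                    selected.add(fallback);
                    break;
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<StorageType> fallbacks = Arrays.asList(StorageType.ARCHIVE);

        // DISK is marked unavailable, so the request falls back to ARCHIVE.
        // Nodes that only have DISK storages then fail the type match.
        System.out.println(choose(3, 2, EnumSet.of(StorageType.DISK),
                StorageType.DISK, fallbacks)); // prints [ARCHIVE]

        // ARCHIVE is unavailable too, so no storage type can be selected.
        System.out.println(choose(3, 2,
                EnumSet.of(StorageType.DISK, StorageType.ARCHIVE),
                StorageType.DISK, fallbacks)); // prints []
    }
}
```

In this model the second call corresponds to the state shown in the WARN lines (selected=[], unavailable=[DISK, ARCHIVE]). The sketch does not explain why DISK was treated as unavailable in the first place.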
We previously upgraded the Hadoop cluster from 2.5 to 2.7.0. I believe the archival storage feature is now in effect, but what happens to the storage type of existing blocks after the upgrade?
The default BlockStoragePolicy is HOT, and my guess is that those blocks do not carry the correct BlockStoragePolicy information, so they cannot be handled properly.
After I shut down the DataNode, replication of the under-replicated blocks is scheduled. So the workaround is to shut down the DataNode.
Could anyone take a look at the issue?
Issue Links
- duplicates HDFS-10453 ReplicationMonitor thread could stuck for long time due to the race between replication and delete of same file in a large cluster. (Resolved)