Description
The erasure coding policy is RS-6-3-1024k; the Hadoop version is 3.0.2.
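For context, a minimal sketch of how such a policy would be applied to a test directory (illustrative only; the directory name and helper method are mine, not part of this report, and assume the Hadoop 3 DistributedFileSystem#setErasureCodingPolicy(Path, String) API):

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

class EcPolicySetup {
  // Illustrative only: files written under "/ecDir" (a hypothetical placeholder)
  // are afterwards striped as 6 data + 3 parity internal blocks.
  static void applyEcPolicy(DistributedFileSystem dfs) throws IOException {
    dfs.setErasureCodingPolicy(new Path("/ecDir"), "RS-6-3-1024k");
  }
}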
Suppose a block group's internal block indices are [0,1,2,3,4,5,6,7,8], the datanodes holding indices [3,4] are being decommissioned, and the datanode holding index 6 has its pendingReplicationWithoutTargets pushed above replicationStreamsHardLimit (which we set to 14). After BlockManager#chooseSourceDatanodes runs, liveBlockIndices is [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommissioning: 2.
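To make that counting concrete, here is a standalone simplification (my own model, not the actual BlockManager#chooseSourceDatanodes code) of why index 6 stays in the live count but drops out of liveBlockIndices:

import java.util.ArrayList;
import java.util.List;

/**
 * Simplified model of the source selection described above: the replica at
 * index 6 is still counted as live, but is skipped as a source because its
 * datanode already has more pending reconstruction work than
 * replicationStreamsHardLimit.
 */
public class ChooseSourcesModel {
  public static void main(String[] args) {
    int replicationStreamsHardLimit = 14;
    int[] blockIndices = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    boolean[] decommissioning = new boolean[9];
    decommissioning[3] = true;
    decommissioning[4] = true;
    int[] pendingWithoutTargets = new int[9];
    pendingWithoutTargets[6] = replicationStreamsHardLimit + 1; // busy node

    List<Integer> liveBlockIndices = new ArrayList<>();
    int live = 0;
    int decommission = 0;
    for (int i : blockIndices) {
      if (decommissioning[i]) {
        decommission++;
        liveBlockIndices.add(i); // decommissioning replicas can still be sources
        continue;
      }
      live++;
      if (pendingWithoutTargets[i] > replicationStreamsHardLimit) {
        continue;                // too busy to be used as a source
      }
      liveBlockIndices.add(i);
    }
    // Prints: liveBlockIndices=[0, 1, 2, 3, 4, 5, 7, 8], live=7, decommission=2
    System.out.println("liveBlockIndices=" + liveBlockIndices
        + ", live=" + live + ", decommission=" + decommission);
  }
}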
In BlockManager#scheduleReconstruction, additionalReplRequired is 9 - 7 = 2. After the NameNode chooses two target datanodes, it assigns an erasure coding reconstruction task to them.
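A minimal sketch of that arithmetic (simplified, not the actual scheduleReconstruction code): because the live count still includes the busy replica at index 6, only two targets are requested even though indices 3, 4 and 6 all need new copies.

/** Simplified arithmetic from the scenario above (not the actual
 *  BlockManager#scheduleReconstruction code). */
public class AdditionalReplRequiredModel {
  public static void main(String[] args) {
    int requiredRedundancy = 6 + 3;   // data + parity blocks for RS-6-3
    int liveReplicas = 7;             // still includes the busy replica at index 6
    int additionalReplRequired = requiredRedundancy - liveReplicas;
    System.out.println(additionalReplRequired);  // 2 -> only 2 targets are chosen
  }
}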
When the datanode receives the task, it builds targetIndices from liveBlockIndices and the number of targets. The code is below.
targetIndices = new short[targets.length];

private void initTargetIndices() {
  BitSet bitset = reconstructor.getLiveBitSet();

  int m = 0;
  hasValidTargets = false;
  for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
    if (!bitset.get(i)) {
      if (reconstructor.getBlockLen(i) > 0) {
        if (m < targets.length) {
          targetIndices[m++] = (short) i;
          hasValidTargets = true;
        }
      }
    }
  }
}
So targetIndices[0] = 6, while targetIndices[1] always stays 0, its default initial value.
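The following standalone replay of the loop above uses the liveBlockIndices and targets.length from this scenario (class and variable names are mine; the loop body mirrors the quoted snippet) and prints [6, 0]:

import java.util.Arrays;
import java.util.BitSet;

/**
 * Replays initTargetIndices() with the values from this scenario: only index 6
 * is missing from the live bit set, but two targets were assigned, so the
 * second slot keeps its default value of 0.
 */
public class TargetIndicesReplay {
  public static void main(String[] args) {
    int dataBlkNum = 6;
    int parityBlkNum = 3;

    // live bit set built from liveBlockIndices = [0,1,2,3,4,5,7,8]
    BitSet liveBitSet = new BitSet(dataBlkNum + parityBlkNum);
    for (int i : new int[]{0, 1, 2, 3, 4, 5, 7, 8}) {
      liveBitSet.set(i);
    }

    // additionalReplRequired was 2, so two targets were assigned
    short[] targetIndices = new short[2];
    long[] blockLen = new long[dataBlkNum + parityBlkNum];
    Arrays.fill(blockLen, 1L);   // every internal block has data

    int m = 0;
    for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
      if (!liveBitSet.get(i) && blockLen[i] > 0 && m < targetIndices.length) {
        targetIndices[m++] = (short) i;
      }
    }
    System.out.println(Arrays.toString(targetIndices));  // [6, 0]
  }
}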
StripedReader then always creates readers for the first 6 live block indices, i.e. [0,1,2,3,4,5].
Using source indices [0,1,2,3,4,5] to reconstruct target indices [6,0] triggers the ISA-L bug: the reconstructed data for block index 6 is corrupted (all zeros).
I wrote a unit test that reproduces this reliably:
private int replicationStreamsHardLimit =
    DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;

numDNs = dataBlocks + parityBlocks + 10;

@Test(timeout = 240000)
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);

  final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
      .getINode4Write(ecFile.toString()).asFile();
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem()
      .getBlockManager().getDatanodeManager()
      .getDatanode(storageInfos[6].getDatanodeUuid());
  BlockInfo firstBlock = fileNode.getBlocks()[0];
  DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock);
  // the first heartbeat will consume 3 replica tasks
  for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) {
    BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor,
        new Block(i), new DatanodeStorageInfo[]{dStorageInfos[0]});
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List<DatanodeInfo> decommisionNodes = new ArrayList<DatanodeInfo>();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  bm.getDatanodeManager().removeDatanode(datanodeDescriptor);
  //assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));

  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!",
      fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes,
      null, blockGroupSize);
}
Attachments
Issue Links
- is related to
  - HDFS-15186 Erasure Coding: Decommission may generate the parity block's content with all 0 in some case (Resolved)
  - HDFS-16479 EC: NameNode should not send a reconstruction work when the source datanodes are insufficient (Resolved)
- relates to
  - HDFS-16497 EC: Add param comment for liveBusyBlockIndices with HDFS-14768 (Resolved)