[HDFS-15240] Erasure Coding: dirty buffer causes reconstruction block error - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 3.3.1, 3.4.0
Fix Version/s: 3.2.2, 3.3.1, 3.4.0
Component/s: datanode, erasure-coding
Labels:
None

Target Version/s:

3.2.2, 3.3.1, 3.4.0
Hadoop Flags:

Reviewed

Description

When read some lzo files we found some blocks were broken.

I read back all internal blocks(b0-b8) of the block group(RS-6-3-1024k) from DN directly, and choose 6(b0-b5) blocks to decode other 3(b6', b7', b8') blocks. And find the longest common sequenece(LCS) between b6'(decoded) and b6(read from DN)(b7'/b7 and b8'/b8).

After selecting 6 blocks of the block group in combinations one time and iterating through all cases, I find one case that the length of LCS is the block length - 64KB, 64KB is just the length of ByteBuffer used by StripedBlockReader. So the corrupt reconstruction block is made by a dirty buffer.

The following log snippet(only show 2 of 28 cases) is my check program output. In my case, I known the 3th block is corrupt, so need other 5 blocks to decode another 3 blocks, then find the 1th block's LCS substring is block length - 64kb.

It means (0,1,2,4,5,6)th blocks were used to reconstruct 3th block, and the dirty buffer was used before read the 1th block.

Must be noted that StripedBlockReader read from the offset 0 of the 1th block after used the dirty buffer.

EDITED for readability.

decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 4
Check the first 131072 bytes between block[6] and block[6'], the longest common substring length is 4
Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4
decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
Check the first 131072 bytes between block[1] and block[1'], the longest common substring length is 65536
CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest common substring length is 27197440  # this one
Check the first 131072 bytes between block[7] and block[7'], the longest common substring length is 4
Check the first 131072 bytes between block[8] and block[8'], the longest common substring length is 4

Now I know the dirty buffer causes reconstruction block error, but how does the dirty buffer come about?

After digging into the code and DN log, I found this following DN log is the root reason.

[INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/xxxxxxxx:52586 remote=/xxxxxxxx:50010]. 180000 millis timeout left.
[WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped block: BP-714356632-xxxxxxxx-1519726836856:blk_-YYYYYYYYYYYYYY_3472979393
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)

Reading from DN may timeout(hold by a future(F)) and output the INFO log, but the futures that contains the future(F) is cleared,

return new StripingChunkReadResult(futures.remove(future),
    StripingChunkReadResult.CANCELLED);

futures.remove(future) cause NPE. So the EC reconstruction is failed. In the finally phase, the code snippet in getStripedReader().close()

reconstructor.freeBuffer(reader.getReadBuffer());
reader.freeReadBuffer();
reader.closeBlockReader();

free buffer firstly, but the StripedBlockReader still holds the buffer and write it, that pollute the buffer of BufferPool.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

test-HDFS-15240.006.patch
30/Oct/20 11:37
14 kB
Toshihiko Uchida
org.apache.hadoop.hdfs.TestReconstructStripedFile-output.txt
30/Oct/20 11:38
24.48 MB
Toshihiko Uchida
org.apache.hadoop.hdfs.TestReconstructStripedFile.txt
30/Oct/20 11:38
5 kB
Toshihiko Uchida
image-2020-07-16-15-56-38-608.png
16/Jul/20 07:57
551 kB
Hongbing Wang
HDFS-15240-branch-3.3-001.patch
04/Dec/20 19:08
19 kB
HuangTao
HDFS-15240-branch-3.3.001.patch
07/Dec/20 02:57
19 kB
Hui Fei
HDFS-15240-branch-3.2.001.patch
04/Dec/20 09:29
19 kB
Xiaoqiao He
HDFS-15240-branch-3.1-001.patch
04/Dec/20 19:08
20 kB
HuangTao
HDFS-15240-branch-3.1.001.patch
07/Dec/20 02:57
20 kB
Hui Fei
HDFS-15240.013.patch
02/Dec/20 14:18
19 kB
HuangTao
HDFS-15240.012.patch
30/Nov/20 02:30
19 kB
HuangTao
HDFS-15240.011.patch
28/Nov/20 14:43
19 kB
HuangTao
HDFS-15240.010.patch
27/Nov/20 07:00
19 kB
HuangTao
HDFS-15240.009.patch
26/Nov/20 06:34
18 kB
HuangTao
HDFS-15240.008.patch
16/Nov/20 08:29
19 kB
HuangTao
HDFS-15240.007.patch
16/Nov/20 02:32
19 kB
HuangTao
HDFS-15240.006.patch
13/Sep/20 00:07
17 kB
HuangTao
HDFS-15240.005.patch
05/Apr/20 10:57
17 kB
HuangTao
HDFS-15240.004.patch
05/Apr/20 05:32
17 kB
HuangTao
HDFS-15240.003.patch
04/Apr/20 04:03
16 kB
HuangTao
HDFS-15240.002.patch
03/Apr/20 18:41
16 kB
HuangTao
HDFS-15240.001.patch
26/Mar/20 09:41
3 kB
HuangTao

Erasure Coding: dirty buffer causes reconstruction block error

Details

Description

Attachments

Attachments

Activity

People

Dates