[HDFS-12487] FsDatasetSpi.isValidBlock() lacks null pointer check inside and neither do the callers - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0
Fix Version/s: 3.3.0, 3.2.1, 3.1.3
Component/s: balancer & mover, diskbalancer
Labels: None
Environment:
CentOS 6.8 x64
CPU: 4 cores
Memory: 16 GB
Hadoop: Release 3.0.0-alpha4
Target Version/s: 3.3.0

Description

BlockIteratorImpl.nextBlock() looks for blocks in the source volume. When no blocks remain, it returns null up to DiskBalancer.getBlockToCopy(), which then passes the result to FsDatasetSpi.isValidBlock() to check whether it is a valid block.
Looking into FsDatasetSpi.isValidBlock(), I found that it does not check for null. The block must be checked for null first; otherwise an exception is thrown.
This bug is hard to hit because the DiskBalancer rarely copies all the data of one volume to others. Even when it does copy all the data of one volume to other volumes, the copy has already finished by the time the bug occurs.
However, when we try to copy all the data of two or more volumes to other volumes in more than one step, the thread is shut down because of the bug above.
The bug can be fixed in two ways:
1) Check for null before calling FsDatasetSpi.isValidBlock().
2) Check for null inside the implementation of FsDatasetSpi.isValidBlock().
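
The two options can be sketched as follows. This is a minimal, self-contained illustration and not the Hadoop code or the attached patch: the Block and Dataset classes and the simplified nextBlock() and getBlockToCopy() helpers are hypothetical stand-ins that only mirror the call chain described above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class NullSafeBlockCheck {

    /** Hypothetical stand-in for a block handle; not the real Hadoop class. */
    static final class Block {
        final long id;
        Block(long id) { this.id = id; }
    }

    /** Hypothetical stand-in for the dataset; it only models isValidBlock(). */
    static final class Dataset {
        private final Set<Long> knownIds = new HashSet<>();

        Dataset(Long... ids) { knownIds.addAll(Arrays.asList(ids)); }

        // Option 2: reject null inside the validity check itself,
        // so every caller is protected.
        boolean isValidBlock(Block b) {
            return b != null && knownIds.contains(b.id);
        }
    }

    /** Returns the next block, or null when the source is exhausted. */
    static Block nextBlock(Iterator<Block> it) {
        return it.hasNext() ? it.next() : null;
    }

    /** Option 1: the caller checks for null before asking the dataset. */
    static Block getBlockToCopy(Iterator<Block> it, Dataset ds) {
        Block b = nextBlock(it);
        if (b == null) {
            return null; // no more blocks on the source volume
        }
        return ds.isValidBlock(b) ? b : null;
    }
}
```

In this sketch both guards are present at once. Either one alone is enough to avoid dereferencing a null block, but guarding inside isValidBlock() also covers callers other than getBlockToCopy().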

Attachments

HDFS-12487.002.patch
21/Sep/17 06:59
0.9 kB
liumi
HDFS-12487.003.patch
22/Sep/17 01:44
1.0 kB
liumi

Issue Links

breaks

HDFS-14599 HDFS-12487 breaks test TestDiskBalancer.testDiskBalancerWithFedClusterWithOneNameServiceEmpty

Resolved

People

Assignee: liumi

Reporter: liumi

Votes: 0

Watchers: 8

Dates

Created: 19/Sep/17 07:25

Updated: 20/May/20 08:24

Resolved: 22/Jun/19 01:19