Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
3.0.1
-
None
-
None
Description
Recently, we found that even though we reserved some space (dfs.datanode.du.reserved), the disk still gets full, and in turn performance degrades. After some testing, we found that we need to consider the space used by ext4 itself (we use ext4 as the local fs). I.e., if the datanode reports that there is 8T of space available, you cannot actually write 8T of data. About 4%-5% of the space is used for ext4 metadata. The following are the metrics of ext4 metadata usage on disks with different capacities (in GB; reserved blocks for root set to 1%, 120MB blocks):
FileSystem Capacity (GB) | Ext4 Data (GB) | Ext4 Metadata (GB) | Metadata Ratio |
3666 | 3629 | 37 | 0.0101 |
7392 | 7318 | 74 | 0.0100 |
733 | 725 | 8 | 0.0109 |
So it seems that ext4 most likely has about the same metadata ratio given the same amount of data and the same directory tree structure.
On the other hand, in our data center there are several disk types with different capacities. It's inefficient to set an absolute value for each datanode according to its disk capacity. For us, it makes sense to set the reservation as a ratio, since we always use ext4 as the local fs. I think the same is true for most other data centers.
Our idea is to leave the default behavior unchanged. Users could still use dfs.datanode.du.reserved to set the reservation, but the datanode would switch to a ratio if some other specified parameter is set.
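As a quick check of the ratios in the table, the following sketch recomputes them from the reported capacity and metadata figures. The helper name `metadata_ratio` is purely illustrative; the numbers are the ones listed above.

```python
# Rows copied from the table above:
# (filesystem capacity, ext4 data, ext4 metadata), all in GB.
rows = [
    (3666, 3629, 37),
    (7392, 7318, 74),
    (733, 725, 8),
]


def metadata_ratio(capacity_gb, metadata_gb):
    """Fraction of the filesystem capacity taken up by ext4 metadata."""
    return metadata_gb / capacity_gb


# Round to four decimal places, matching the precision used in the table.
ratios = [round(metadata_ratio(cap, meta), 4) for cap, _, meta in rows]

for (cap, data, meta), ratio in zip(rows, ratios):
    # Data plus metadata should add up to the reported capacity.
    assert data + meta == cap
    print(f"{cap} GB: metadata ratio {ratio:.4f}")
```

All three disks come out at roughly 1% despite a tenfold difference in capacity, which is what motivates a ratio-based reservation.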
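The proposed behavior could be sketched roughly as follows. This is only an illustration of the idea, not the datanode implementation: the function names and the `reserved_ratio` parameter are hypothetical, and only the absolute value corresponds to the existing dfs.datanode.du.reserved setting.

```python
def reserved_bytes(capacity_bytes, absolute_reserved_bytes=0, reserved_ratio=None):
    """Return how many bytes to keep free on one volume.

    absolute_reserved_bytes stands in for dfs.datanode.du.reserved.
    reserved_ratio is a hypothetical opt-in parameter (0.0 to 1.0);
    when it is None, the default absolute behavior is unchanged.
    """
    if reserved_ratio is None:
        return absolute_reserved_bytes
    if not 0.0 <= reserved_ratio <= 1.0:
        raise ValueError("reserved_ratio must be between 0.0 and 1.0")
    return int(capacity_bytes * reserved_ratio)


def available_bytes(capacity_bytes, used_bytes, reserved):
    """Space the datanode would report as writable, never negative."""
    return max(capacity_bytes - used_bytes - reserved, 0)


GB = 1024 ** 3

# One ratio covers disks of different sizes, instead of one absolute
# value per disk type. The capacities are taken from the table above;
# the 5% ratio is only an example value.
for capacity_gb in (733, 3666, 7392):
    capacity = capacity_gb * GB
    reserved = reserved_bytes(capacity, reserved_ratio=0.05)
    print(capacity_gb, "GB disk reserves about", reserved // GB, "GB")
```

Keeping the ratio optional means existing clusters that only set an absolute reservation see no change in behavior.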
Attachments
Issue Links
- duplicates
-
HDFS-13283 Percentage based Reserved Space Calculation for DataNode
- Resolved