Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Duplicate
- Affects Version/s: 2.3.0
- Fix Version/s: None
Description
When running a DataNode on a machine with a large disk volume, we found that the du operations from org.apache.hadoop.fs.DU's DURefreshThread consume a significant amount of disk I/O.
Since we use the whole disk for HDFS storage, the volume usage could instead be calculated with the "df" command, which reads filesystem-level counters rather than walking the block directories. Would it make sense to add a "df" option for usage calculation in HDFS (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?
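To illustrate why df is cheaper: du must traverse every block file under the volume, while df-style accounting only reads the partition's capacity and free-space counters. Below is a minimal sketch of that idea using the standard `java.io.File` space methods; `dfUsed` is a hypothetical helper for illustration, not part of the Hadoop API, and it assumes the whole partition is dedicated to HDFS storage (otherwise df overcounts usage by other files on the same partition).

```java
import java.io.File;

public class DfUsageSketch {
    // Hypothetical helper (not Hadoop API): bytes in use on the
    // partition containing `dir`, computed df-style from the
    // filesystem counters. Cost is constant, regardless of how
    // many block files the volume holds.
    static long dfUsed(File dir) {
        return dir.getTotalSpace() - dir.getFreeSpace();
    }

    public static void main(String[] args) {
        File volume = new File(args.length > 0 ? args[0] : ".");
        long used = dfUsed(volume);
        System.out.println("used bytes on partition: " + used);
    }
}
```

A du-style implementation would instead recursively sum `File.length()` over every file in the tree, which is what makes DURefreshThread expensive on large volumes with many blocks. HADOOP-12974 (linked below) introduced a df-based `CachingGetSpaceUsed` implementation along these lines.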
Attachments
Issue Links
- duplicates
  - HADOOP-12974 Create a CachingGetSpaceUsed implementation that uses df (Resolved)
- is related to
  - HDFS-8791 block ID-based DN storage layout can be very slow for datanode on ext4 (Resolved)