Description
As disks are getting larger and more plentiful, we're seeing DataNodes (DNs) with multiple millions of blocks on a single machine. When page cache space is tight, block reports can take multiple minutes to generate. Currently, the FSVolumeSet lock is held for the whole scan of the data directories that generates a report. Reads and writes block or time out while the scan runs, which causes big problems, especially for clients like HBase.
This JIRA is to explore some of the ideas originally discussed in HADOOP-4584 for the 0.20.20x series.
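One way to picture the problem and a possible direction is the minimal sketch below. It is not the DataNode code and not a proposed patch: `BlockReportSketch`, `volumeLock`, the `disk` set, and the `duringScan` hook are all hypothetical stand-ins. It contrasts a report that holds a single lock for the whole scan with one that scans without the lock and then takes it only briefly to apply the changes that arrived in the meantime.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a DataNode volume; NOT the real FSVolumeSet. */
final class BlockReportSketch {
    // Stands in for the FSVolumeSet lock described above (assumption).
    private final ReentrantLock volumeLock = new ReentrantLock();
    // Simulated on-disk block IDs; a real scan would walk the data directories.
    private final Set<Long> disk = ConcurrentHashMap.newKeySet();
    // Changes recorded while a lock-free scan is running.
    private final Set<Long> added = new HashSet<>();
    private final Set<Long> removed = new HashSet<>();
    private boolean scanning = false;

    void addBlock(long id) {
        volumeLock.lock();
        try {
            disk.add(id);
            if (scanning) { added.add(id); removed.remove(id); }
        } finally { volumeLock.unlock(); }
    }

    void removeBlock(long id) {
        volumeLock.lock();
        try {
            disk.remove(id);
            if (scanning) { removed.add(id); added.remove(id); }
        } finally { volumeLock.unlock(); }
    }

    /** Current behaviour as described: readers and writers wait for the whole scan. */
    Set<Long> reportHoldingLock() {
        volumeLock.lock();
        try {
            return new HashSet<>(disk); // the slow scan happens under the lock
        } finally { volumeLock.unlock(); }
    }

    /** Sketch of an alternative: scan unlocked, then reconcile under a short lock. */
    Set<Long> reportScanOutsideLock(Runnable duringScan) {
        volumeLock.lock();
        try { scanning = true; added.clear(); removed.clear(); }
        finally { volumeLock.unlock(); }

        Set<Long> result = new HashSet<>(disk); // slow scan, lock not held
        duringScan.run();                       // test hook simulating concurrent writers

        volumeLock.lock();
        try {
            result.addAll(added);      // blocks created during the scan
            result.removeAll(removed); // blocks deleted during the scan
            scanning = false;
            return result;
        } finally { volumeLock.unlock(); }
    }

    public static void main(String[] args) {
        BlockReportSketch s = new BlockReportSketch();
        s.addBlock(1L); s.addBlock(2L); s.addBlock(3L);
        // A writer adds block 4 and deletes block 1 while the scan is in flight.
        System.out.println(s.reportScanOutsideLock(() -> { s.addBlock(4L); s.removeBlock(1L); }));
    }
}
```

In this sketch the lock is held only for the two short bookkeeping sections, so writers are delayed by the reconciliation step rather than by the directory scan. Whether a similar split is safe in the real DataNode depends on details the sketch ignores, such as blocks being moved between directories during the scan.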
Attachments
Issue Links
- is duplicated by
  - HDFS-159 Block reports should be processed offline (Resolved)
- is related to
  - HDFS-2384 Improve speed of block report generation (Open)
- relates to
  - HDFS-1280 SocketTimeoutException during HDFS Block report generation (Open)
  - HDFS-2282 Semi-harmless race between block reports and block invalidation (Resolved)
  - HADOOP-4584 Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode (Closed)