Details
-
Task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
In method isLazyPersist(), the org.apache.hadoop.hdfs.DataStreamer class checks whether the HDFS file is lazy persist. It does two things:
1. Create a class-wide static BlockStoragePolicySuite object, which builds an array of BlockStoragePolicy internally
2. Get a block storage policy object from the blockStoragePolicySuite by policy name HdfsConstants.MEMORY_STORAGE_POLICY_NAME
This has two side effects:
1. Takes time to iterate the pre-built block storage policy array in order to find the same policy every time whose id matters only (as we need to compare the file status policy id with lazy persist policy id)
2. DataStreamer class imports BlockStoragePolicySuite. The former should be moved to hadoop-hdfs-client module, while the latter can stay in hadoop-hdfs module.
Actually, we have the block storage policy IDs, which can be used to compare with HDFS file status' policy id, as following:
static boolean isLazyPersist(HdfsFileStatus stat) { return stat.getStoragePolicy() == HdfsConstants.MEMORY_STORAGE_POLICY_ID; }
This way, we only need to move the block storage policies' IDs from HdfsServerConstant (hadoop-hdfs module) to HdfsConstants (hadoop-hdfs-client module).
Another reason we should move those block storage policy IDs is that the block storage policy names were moved to HdfsConstants already.