Thanks Brahma Reddy Battula for the patch.
Thanks Ming Ma for the review.
Those are good points.
The existing blockHasEnoughRacksStriped compares getRealDataBlockNum (# of data blocks) with the number of racks. But after the refactoring, it compares getRealTotalBlockNum (# of total blocks) with the number of racks.
Yes, you are right. It should use getRealDataBlockNum as minracks. I found that this was discussed in HDFS-7613.
The current patch doesn't apply to branch-2. If you agree with the above changes, could you check whether it applies to branch-2? If it doesn't, you will need to provide a separate patch for branch-2 later.
Yes, of course it will not apply. Instead of providing a separate branch-2 patch, which might create more conflicts when EC is merged to branch-2, how about waiting until EC is merged to branch-2? After that, a simple cherry-pick might work.
A general question about striped EC. It uses "# of racks >= # of data blocks" to check if a given block has enough racks. But what if "# of racks for the whole cluster < # of data blocks"? Say we use RS(6,3) and the cluster has 5 racks. The write operation will spread the 9 blocks across 5 racks and succeed. But will it fail the "enough racks" check later in BM? That has nothing to do with the refactoring work here; I just want to bring it up in case others can chime in.
You are right. The check will fail, which means the block will not be removed from the neededReplications map.
The "# of racks >= # of data blocks" requirement is there to ensure that a rack-wise failure doesn't cause any data loss for an EC'ed file.
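To make the two cases above concrete, here is a minimal standalone sketch of the rack comparison. It is not the BlockManager code: the class RackCheckSketch, the helpers countRacks and hasEnoughRacks, and the rack names are all hypothetical, and it only models "distinct racks >= minRacks". It shows why using the total block count as minRacks is stricter than using the data block count, and why a 5-rack cluster fails the check for RS(6,3) either way.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class RackCheckSketch {
    // Hypothetical helper: number of distinct racks holding the block group.
    static int countRacks(List<String> rackOfEachBlock) {
        return new HashSet<>(rackOfEachBlock).size();
    }

    // Hypothetical stand-in for the "enough racks" comparison discussed above.
    static boolean hasEnoughRacks(List<String> rackOfEachBlock, int minRacks) {
        return countRacks(rackOfEachBlock) >= minRacks;
    }

    public static void main(String[] args) {
        int dataBlocks = 6;   // RS(6,3): 6 data blocks
        int totalBlocks = 9;  // 6 data + 3 parity blocks

        // 9 blocks spread over a cluster that has only 5 racks.
        List<String> fiveRacks = Arrays.asList(
            "r1", "r2", "r3", "r4", "r5", "r1", "r2", "r3", "r4");
        // 9 blocks spread over 6 racks.
        List<String> sixRacks = Arrays.asList(
            "r1", "r2", "r3", "r4", "r5", "r6", "r1", "r2", "r3");

        // 5 racks < 6 data blocks: the check can never pass on this cluster.
        System.out.println(hasEnoughRacks(fiveRacks, dataBlocks));  // false
        // 6 racks: passes with minRacks = data blocks ...
        System.out.println(hasEnoughRacks(sixRacks, dataBlocks));   // true
        // ... but fails if minRacks is taken from the total block count.
        System.out.println(hasEnoughRacks(sixRacks, totalBlocks));  // false
    }
}
```

In the 5-rack case no placement can satisfy the comparison, which matches the observation that the block would stay in neededReplications.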