> Block is an existing base class and it is there for a very long time. We cannot simply view it in one way yesterday and view it in another way today.
I agree that Block is a long-standing base class, but I disagree that we cannot make any change to it. Making the generation stamp part of its key was introduced by the append project. The more I think about the new append design, the more I feel that this was a design flaw: it has caused many problems, because we cannot handle multiple generation stamps of the same block. That's why I want to make the proposed change.
As I said in the description of this jira, this change is based on the following facts: (1) On each datanode, only one replica of a block exists. Therefore there is only one generation of a block on that datanode. (2) The NameNode has only one entry for a block in its blocks map.
So in all the maps that you mentioned, there should be only one entry per block id. On either the NN or a DN, there is only one blockInfo or replicaInfo entry per block. I do not think changing the key of Block should cause any problem for these data structures in dfs. If two generations of a block show up in those data structures, that is an error, and dfs should handle it whether or not the key contains the generation stamp.
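To make the point about the maps concrete, here is a minimal sketch of a block keyed by block id only. SketchBlock and BlockKeySketch are hypothetical names made up for illustration; this is not the real Block class or the actual patch, and the plain HashMap only stands in for the blocks map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Block; NOT the real HDFS class. */
final class SketchBlock {
    private final long blockId;
    private final long numBytes;
    private final long generationStamp;

    SketchBlock(long blockId, long numBytes, long generationStamp) {
        this.blockId = blockId;
        this.numBytes = numBytes;
        this.generationStamp = generationStamp;
    }

    long getBlockId() { return blockId; }
    long getNumBytes() { return numBytes; }
    long getGenerationStamp() { return generationStamp; }

    // Assumption for this sketch: identity is the block id alone,
    // so the generation stamp is not part of the key.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchBlock)) return false;
        return blockId == ((SketchBlock) o).blockId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(blockId);
    }
}

final class BlockKeySketch {
    /** Returns true if the stored entry and the probe disagree on generation stamp. */
    static boolean hasStampMismatch(Map<SketchBlock, SketchBlock> blocksMap, SketchBlock probe) {
        SketchBlock stored = blocksMap.get(probe); // lookup by block id only
        return stored != null && stored.getGenerationStamp() != probe.getGenerationStamp();
    }

    public static void main(String[] args) {
        // Plain HashMap standing in for the blocks map (an assumption of this sketch).
        Map<SketchBlock, SketchBlock> blocksMap = new HashMap<>();
        SketchBlock stored = new SketchBlock(42L, 1024L, 1001L);
        blocksMap.put(stored, stored);

        // The same block id with a different generation stamp maps to the same key,
        // so the existing entry is found and the mismatch can be detected explicitly.
        SketchBlock reported = new SketchBlock(42L, 1024L, 1002L);
        System.out.println("found entry: " + blocksMap.containsKey(reported));
        System.out.println("stamp mismatch: " + hasStampMismatch(blocksMap, reported));
        System.out.println("entries: " + blocksMap.size());
    }
}
```

With a key like this, a second generation of the same block cannot silently become a second map entry. The lookup finds the existing entry, and the caller has to compare generation stamps itself and treat a mismatch as the error case.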
I agree this may break some external applications. We could document this change in the release notes.