Having variable replication (HADOOP-51), it is natural to be able to
change replication for existing files. This patch introduces the functionality.
Here is a detailed list of issues addressed by the patch.
1) setReplication() and getReplication() methods are implemented.
2) DFSShell prints file replication for any listed file.
3) Bug fix. FSDirectory.delete() logs the delete operation even if the delete is not successful.
4) Bug fix. This is a distributed bug: a race between the name node and the data nodes.
Suppose that file replication is 3, and a client reduces it to 1.
Two data nodes will be chosen to remove their copies, and will do so.
After a while they will report to the name node that the copies have actually been deleted.
Until they report, the name node assumes the copies still exist.
Now the client decides to increase replication back to 3 BEFORE the data nodes
have reported that the copies are deleted. Then the name node can choose one of the data nodes
that it thinks has a block copy, but that has in fact already deleted it, to replicate the block to new data nodes.
This scenario is quite unusual, but it is possible even without variable replication.
5) Logging for name and data nodes is improved in several cases.
E.g. data nodes never logged that they deleted a block.
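
The race in item 4 can be illustrated with a small toy model. This is only a sketch, not the code in the patch: the class ToyNameNode and its methods are hypothetical names invented for illustration, and the "guarded" selection shown is just one possible way to avoid the problem, not necessarily the approach the patch takes. It models a name node that still lists a data node as a block holder after asking it to delete its copy.

```java
import java.util.Set;
import java.util.TreeSet;

public class ReplicationRaceSketch {

    /** Hypothetical, simplified name node state for a single block. */
    public static class ToyNameNode {
        // Data nodes the name node believes hold a copy of the block.
        private final Set<String> believedHolders = new TreeSet<>();
        // Data nodes told to delete their copy that have not yet confirmed.
        private final Set<String> pendingDeletes = new TreeSet<>();

        public void addReplica(String node) {
            believedHolders.add(node);
        }

        // The name node asks a data node to drop its copy (replication reduced).
        public void scheduleDelete(String node) {
            if (believedHolders.contains(node)) {
                pendingDeletes.add(node);
            }
        }

        // The data node reports that the copy is actually gone.
        public void confirmDelete(String node) {
            pendingDeletes.remove(node);
            believedHolders.remove(node);
        }

        // Naive choice: any node still believed to hold the block.
        public Set<String> naiveSources() {
            return new TreeSet<>(believedHolders);
        }

        // Guarded choice: skip nodes whose copy is scheduled for deletion.
        public Set<String> guardedSources() {
            Set<String> sources = new TreeSet<>(believedHolders);
            sources.removeAll(pendingDeletes);
            return sources;
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode();
        nn.addReplica("dn1");
        nn.addReplica("dn2");
        nn.addReplica("dn3");

        // Replication reduced from 3 to 1: dn2 and dn3 are told to delete,
        // but have not yet reported back.
        nn.scheduleDelete("dn2");
        nn.scheduleDelete("dn3");

        // Replication raised back to 3 before the reports arrive.
        System.out.println("naive sources:   " + nn.naiveSources());
        System.out.println("guarded sources: " + nn.guardedSources());
    }
}
```

In this toy model the naive selection still offers dn2 and dn3 as replication sources, although they may already have deleted the block, while the guarded selection offers only dn1.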