Thanks for the detailed writeup Sammi!
Andrew, can you explain why you think, for a directory, the replication policy should be returned when getErasureCodingPolicy is called?
I was thinking of a use case where a user wants to redo the policies on a directory tree. Unless they can distinguish between states 1 and 2 vs 3 via a get API, they need to call set/remove on every directory to get exactly what they want. Another use case is distcp, where you might want to exactly replicate the same storage policy setup on a destination cluster.
Looking at StoragePolicy though, it just returns the inherited policy. I don't see a way to check if a policy is inherited or explicitly set. IMO this is a flaw (particularly for distcp), but it's better to follow suit for continuity. It's also less bad for EC since there's no way to change the EC policy for a file.
Also referencing StoragePolicy, there's the idea of a default storage policy for the cluster. This is hardcoded to HOT, and is returned when you call getStoragePolicy. To align with getStoragePolicy, arguably getECPolicy should return a special "replicated" ECPolicy, but that makes isErasureCoded checks more complicated.
So, all said, let's just return the inherited EC policy if it's not replicated. We also need to validate the isErasureCoded checks internally, since I know for instance we restrict the set of storage policies that work on erasure coded files.
Unless we introduce another policy manipulation API, such as "setDefaultReplicationPolicy", which handles changing a directory from an EC policy to the replication policy.
I like this idea, since calling the replication policy an "ECPolicy" is a misnomer. It's also confusing if we make people set it via setECPolicy but don't return it in getECPolicy.
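To make the behavior discussed above concrete, here is a minimal sketch of the lookup semantics as a toy in-memory model. It is not HDFS code: the EcPolicyModel and Dir classes, the REPLICATION marker, and unsetErasureCodingPolicy are hypothetical names for illustration, setDefaultReplicationPolicy is only the API proposed in this thread, and the policy name is just an example string. The assumption is that getErasureCodingPolicy returns the nearest explicitly set EC policy, or null when the effective setting is replication.

```java
/** Toy model of a directory tree. Not HDFS code; all names are hypothetical. */
public class EcPolicyModel {
    /** Internal marker: this directory was explicitly switched back to replication. */
    static final String REPLICATION = "REPLICATION";

    static final class Dir {
        final Dir parent;
        // null means nothing is set on this directory, so it inherits from its parent.
        String explicitPolicy;

        Dir(Dir parent) { this.parent = parent; }

        void setErasureCodingPolicy(String name) { explicitPolicy = name; }

        // The API proposed in the thread: explicitly override an inherited EC policy.
        void setDefaultReplicationPolicy() { explicitPolicy = REPLICATION; }

        // Hypothetical: clear any explicit setting so the directory inherits again.
        void unsetErasureCodingPolicy() { explicitPolicy = null; }

        /** Returns the effective EC policy name, or null if effectively replicated. */
        String getErasureCodingPolicy() {
            // Walk up to the nearest directory with an explicit setting.
            for (Dir d = this; d != null; d = d.parent) {
                if (d.explicitPolicy != null) {
                    return REPLICATION.equals(d.explicitPolicy) ? null : d.explicitPolicy;
                }
            }
            return null; // nothing set anywhere up the tree: plain replication
        }

        boolean isErasureCoded() { return getErasureCodingPolicy() != null; }
    }

    public static void main(String[] args) {
        Dir root = new Dir(null);
        Dir a = new Dir(root);
        Dir b = new Dir(a);

        a.setErasureCodingPolicy("RS-6-3-1024k");
        System.out.println(b.getErasureCodingPolicy()); // prints "RS-6-3-1024k" (inherited)

        b.setDefaultReplicationPolicy();
        System.out.println(b.getErasureCodingPolicy()); // prints "null" (explicit replication)
    }
}
```

Note that in this model the getter cannot tell an inherited policy from an explicitly set one, which is the same limitation as getStoragePolicy mentioned above. The replication override is set through its own call rather than setECPolicy, so isErasureCoded stays a simple null check.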