This is a third proposal to solve the problem described in
The problem: when we run distcp from one cluster to another (or within the same cluster), we copy the metadata from source to target in addition to the file data. If an external attribute provider is enabled, that metadata may be read from the provider, so provider data read from the source may be saved to the target HDFS.
We want to avoid saving metadata from the external provider to HDFS, so we want to bypass the external provider when doing the distcp (or hadoop fs -cp) operation.
The idea is to introduce a new config that specifies a special user (or a list of users), and to let the NN bypass the external provider when the current user is one of those special users.
An application that needs data from the external attribute provider will not work if it runs as a special user. So the constraint of this approach is that the special users must not run applications that need data from the external provider.
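To make the proposed check concrete, here is a minimal sketch of how the bypass decision could look. It is not the actual NameNode code: the class name, the config key, and the resolveOwner helper are all hypothetical and exist only for illustration. It assumes the config value is a comma-separated list of user names.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; names here are hypothetical, not real HDFS APIs. */
public class ProviderBypassSketch {
    // Hypothetical config key; the proposal does not name one.
    public static final String BYPASS_USERS_KEY =
        "example.attributes.provider.bypass.users";

    private final Set<String> bypassUsers;

    /** Parses an assumed comma-separated list of special users. */
    public ProviderBypassSketch(String commaSeparatedUsers) {
        Set<String> users = new HashSet<>();
        if (commaSeparatedUsers != null) {
            for (String user : commaSeparatedUsers.split(",")) {
                String trimmed = user.trim();
                if (!trimmed.isEmpty()) {
                    users.add(trimmed);
                }
            }
        }
        this.bypassUsers = Collections.unmodifiableSet(users);
    }

    /** True when the current user is one of the configured special users. */
    public boolean shouldBypassProvider(String currentUser) {
        return currentUser != null && bypassUsers.contains(currentUser);
    }

    /**
     * Picks which owner value the caller sees: the one stored in HDFS
     * for special users, the provider's value for everyone else.
     */
    public String resolveOwner(String currentUser, String hdfsOwner,
                               String providerOwner) {
        return shouldBypassProvider(currentUser) ? hdfsOwner : providerOwner;
    }
}
```

With the list set to "distcp", a copy run as user distcp would see the owner stored in HDFS, while every other user would still see the provider's value. This also shows the constraint above: anything else run as distcp loses the provider's view of the metadata.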
I'm creating this issue to discuss the approach further.