  1. Hadoop HDFS
  2. HDFS-12294

Let distcp bypass the external attribute provider when calling getFileStatus etc. at the source cluster

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels:
      None

      Description

      This is an alternative solution to HDFS-12202, which proposed introducing a new set of APIs with an additional boolean parameter, bypassExtAttrProvider, so that the NameNode (NN) bypasses the external attribute provider when serving getFileStatus. The goal is to prevent distcp from copying attributes supplied by one cluster's external attribute provider and saving them into another cluster's fsimage.

      Instead of adding a parameter, the solution here encodes it in the path itself: when getFileStatus (and some other calls) is invoked, the NN parses the path at the very beginning and determines whether the external attribute provider needs to be bypassed. The suggested encoding is to add a prefix to the path before calling getFileStatus, e.g. /a/b/c becomes /.reserved/bypassExtAttr/a/b/c.

      Many thanks to Andrew Wang for this suggestion. The scope of the change is smaller than in HDFS-12202, and the FileSystem APIs do not have to change.
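The path-prefix encoding described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual patch: the class and method names (BypassPathSketch, encode, parse, Parsed) are hypothetical and do not come from the Hadoop code base. Only the prefix /.reserved/bypassExtAttr is taken from the description.

```java
// Hypothetical sketch: all class and method names here are illustrative
// assumptions, not Hadoop APIs. Only the prefix string comes from the issue.
public final class BypassPathSketch {
    // Prefix suggested in the issue description.
    public static final String BYPASS_PREFIX = "/.reserved/bypassExtAttr";

    /** Parse result: the real path and whether to bypass the provider. */
    public static final class Parsed {
        public final String path;
        public final boolean bypassProvider;

        Parsed(String path, boolean bypassProvider) {
            this.path = path;
            this.bypassProvider = bypassProvider;
        }
    }

    /** Caller side (e.g. distcp): prepend the prefix to an absolute path. */
    public static String encode(String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + absolutePath);
        }
        // The root path maps to the bare prefix so no trailing slash is added.
        return "/".equals(absolutePath) ? BYPASS_PREFIX : BYPASS_PREFIX + absolutePath;
    }

    /** NameNode side: strip the prefix if present and report the bypass flag. */
    public static Parsed parse(String path) {
        if (path.equals(BYPASS_PREFIX)) {
            return new Parsed("/", true);
        }
        // Require a '/' after the prefix so that a path such as
        // /.reserved/bypassExtAttrX/a is not mistaken for an encoded path.
        if (path.startsWith(BYPASS_PREFIX + "/")) {
            return new Parsed(path.substring(BYPASS_PREFIX.length()), true);
        }
        return new Parsed(path, false);
    }

    public static void main(String[] args) {
        String encoded = encode("/a/b/c");
        Parsed parsed = parse(encoded);
        System.out.println(encoded + " -> " + parsed.path
                + " bypass=" + parsed.bypassProvider);
    }
}
```

Because the flag travels inside the path string, a caller could pass the encoded path to an unchanged getFileStatus-style call, which is why this approach would not require new FileSystem method signatures.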

              People

              • Assignee:
                yzhangal Yongjun Zhang
                Reporter:
                yzhangal Yongjun Zhang
              • Votes:
                0
                Watchers:
                4
