  1. Hadoop HDFS
  2. HDFS-12294

Let distcp bypass the external attribute provider when calling getFileStatus etc. at the source cluster


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • None
    • None
    • Component/s: hdfs
    • None

    Description

      This is an alternative solution to HDFS-12202, which proposed introducing a new set of APIs with an additional boolean parameter, bypassExtAttrProvider, to let the NN bypass the external attribute provider in getFileStatus. The goal is to prevent distcp from copying attributes from one cluster's external attribute provider and saving them to another cluster's fsimage.

      The solution here is to encode this parameter in the path itself instead of adding a parameter. When getFileStatus (or one of some other calls) is invoked, the NN parses the path and determines whether the external attribute provider needs to be bypassed. The suggested encoding is to add a prefix to the path before calling getFileStatus, e.g. /a/b/c becomes /.reserved/bypassExtAttr/a/b/c. The NN parses the path at the very beginning of the call.
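
      The encoding described above can be sketched roughly as follows. This is only an illustration of the prefix idea, not the actual patch: the class name BypassPathSketch, the constant BYPASS_PREFIX, and the three helper methods are hypothetical, and only the prefix string /.reserved/bypassExtAttr comes from the description. A client such as distcp would add the prefix; the NN would detect and strip it before resolving the path.

```java
public final class BypassPathSketch {
    // Prefix suggested in the issue description; the constant name is hypothetical.
    public static final String BYPASS_PREFIX = "/.reserved/bypassExtAttr";

    private BypassPathSketch() {}

    /** Client side (e.g. distcp): encode the bypass request into an absolute path. */
    public static String addBypassPrefix(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path: " + path);
        }
        // The root "/" maps to the bare prefix so no trailing slash is produced.
        return path.equals("/") ? BYPASS_PREFIX : BYPASS_PREFIX + path;
    }

    /** NN side: true if the path asks to bypass the external attribute provider. */
    public static boolean isBypassPath(String path) {
        // Require an exact match or a following "/" so that a sibling such as
        // "/.reserved/bypassExtAttrFoo" is not mistaken for the prefix.
        return path.equals(BYPASS_PREFIX) || path.startsWith(BYPASS_PREFIX + "/");
    }

    /** NN side: recover the real path; paths without the prefix pass through unchanged. */
    public static String stripBypassPrefix(String path) {
        if (!isBypassPath(path)) {
            return path;
        }
        String rest = path.substring(BYPASS_PREFIX.length());
        return rest.isEmpty() ? "/" : rest;
    }

    public static void main(String[] args) {
        String encoded = addBypassPrefix("/a/b/c");
        System.out.println(encoded);                    // /.reserved/bypassExtAttr/a/b/c
        System.out.println(isBypassPath(encoded));      // true
        System.out.println(stripBypassPrefix(encoded)); // /a/b/c
    }
}
```

      Because the flag travels inside the path string, existing call signatures such as getFileStatus stay unchanged in this sketch; the cost is that every affected NN entry point has to perform the prefix check before path resolution.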

      Thanks much to andrew.wang for this suggestion. The scope of change is smaller and we don't have to change the FileSystem APIs.

            People

              Assignee: Yongjun Zhang (yzhangal)
              Reporter: Yongjun Zhang (yzhangal)
              Votes: 0
              Watchers: 4
