SPARK-3788: Yarn dist cache code is not friendly to HDFS HA, Federation

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.1, 1.2.0
    • Component/s: YARN
    • Labels: None

    Description

      There are two bugs here.

      1. The compareFs() method in ClientBase treats the 'host' part of the URI as an actual hostname. With HA and Federation, that part is a namespace name, which does not resolve to any address. So in those cases, compareFs() always reports the file systems as different.

      2. In prepareLocalResources(), files are added to the distributed cache using the common FileSystem object instantiated at the start of the method. With Federation that doesn't work: the qualified URL's scheme may differ from the non-qualified one, so that FileSystem instance cannot be used for the qualified path.
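The first bug can be illustrated with a minimal sketch that compares two file system URIs as plain strings, without any DNS lookup. This is not Spark's actual compareFs(); the class `FsCompare`, the method `sameFileSystem`, and the nameservice names are hypothetical, and the sketch assumes that two URIs with the same scheme, authority host, and port refer to the same file system.

```java
import java.net.URI;

class FsCompare {
    // Hypothetical helper: decides whether two URIs point at the same file
    // system by comparing scheme, host, and port textually. The "host" is never
    // resolved, so an HA or Federation nameservice name works like a hostname.
    public static boolean sameFileSystem(URI src, URI dst) {
        String srcScheme = src.getScheme();
        if (srcScheme == null || !srcScheme.equalsIgnoreCase(dst.getScheme())) {
            return false;
        }
        String srcHost = src.getHost();
        String dstHost = dst.getHost();
        if (srcHost != null && dstHost != null) {
            if (!srcHost.equalsIgnoreCase(dstHost)) {
                return false;
            }
        } else if (srcHost != null || dstHost != null) {
            // Only one side has a host component.
            return false;
        }
        // getPort() returns -1 when the URI has no explicit port.
        return src.getPort() == dst.getPort();
    }

    public static void main(String[] args) {
        // "nameservice1" is an assumed logical name, not a resolvable host.
        URI a = URI.create("hdfs://nameservice1/user/spark/app.jar");
        URI b = URI.create("hdfs://nameservice1/user/spark/.sparkStaging");
        System.out.println(sameFileSystem(a, b));
    }
}
```

A real implementation might still fall back to hostname resolution when the two strings differ, so that aliases of the same NameNode compare equal; the sketch leaves that out to stay self-contained.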

      Fixes are pretty trivial.
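For the second bug, one plausible shape of a fix is to look up the file system from each qualified path instead of reusing the shared instance; in Hadoop this corresponds to calling `Path.getFileSystem(Configuration)` on the qualified path. The sketch below models the idea with plain URIs so that it runs without Hadoop on the classpath. The class `PerPathFs`, its methods, and the example URIs are hypothetical stand-ins, not Spark or Hadoop code.

```java
import java.net.URI;
import java.util.Objects;

class PerPathFs {
    // Identifies a file system by scheme and authority only (an assumption of
    // this model; Hadoop's own lookup also depends on configuration).
    static String fsKey(URI uri) {
        return uri.getScheme() + "://" + Objects.toString(uri.getAuthority(), "");
    }

    // True when a file system rooted at fsUri could serve the given path.
    public static boolean canServe(URI fsUri, URI path) {
        return fsKey(fsUri).equalsIgnoreCase(fsKey(path));
    }

    // Derives the file system root from the qualified path itself, rather
    // than assuming the shared default file system applies.
    public static URI fileSystemFor(URI qualifiedPath) {
        return URI.create(fsKey(qualifiedPath) + "/");
    }

    public static void main(String[] args) {
        // Assumed setup: the default file system is a viewfs mount table,
        // while the qualified path points at an underlying HDFS nameservice.
        URI shared = URI.create("viewfs://cluster/");
        URI qualified = URI.create("hdfs://ns1/user/spark/app.jar");
        System.out.println(canServe(shared, qualified));
        System.out.println(canServe(fileSystemFor(qualified), qualified));
    }
}
```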

            People

              Assignee: Marcelo Masiero Vanzin (vanzin)
              Reporter: Marcelo Masiero Vanzin (vanzin)
              Votes: 0
              Watchers: 3