SPARK-23720

Leverage shuffle service when running in non-host networking mode in Hadoop 3 Docker support

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      In the current external shuffle service integration, the hostname of the executor and of the shuffle service is the same, while the ports differ (shuffle service port vs. block manager port).
      When running in non-host networking mode under Docker on YARN, the shuffle service runs on NM_HOST, while the Docker container runs under its own (ephemeral, generated) hostname.

      When the external shuffle service is enabled, we should use the hostname of the container's host machine for the shuffle service, not the container hostname.
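
The proposed behavior can be illustrated with a minimal sketch. This is not Spark's implementation: the function `shuffle_server_address` and its parameters are hypothetical names chosen for illustration. It assumes the NodeManager host is visible to the container through the `NM_HOST` environment variable, and it uses the existing Spark settings `spark.shuffle.service.enabled` and `spark.shuffle.service.port` (default 7337).

```python
from typing import Mapping, Tuple

# Spark's documented default for spark.shuffle.service.port.
DEFAULT_SHUFFLE_SERVICE_PORT = 7337


def shuffle_server_address(
    conf: Mapping[str, str],
    env: Mapping[str, str],
    container_hostname: str,
    block_manager_port: int,
) -> Tuple[str, int]:
    """Return the (host, port) peers would use to fetch shuffle blocks.

    Hypothetical helper: sketches the hostname choice proposed in this issue.
    """
    enabled = conf.get("spark.shuffle.service.enabled", "false").lower() == "true"
    if not enabled:
        # No external shuffle service: blocks are served by the executor's
        # own block manager, reachable at the container hostname.
        return container_hostname, block_manager_port

    port = int(conf.get("spark.shuffle.service.port", DEFAULT_SHUFFLE_SERVICE_PORT))
    # Assumption: NM_HOST names the machine running the NodeManager, which is
    # where the shuffle service listens. Fall back to the container hostname
    # (the current behavior) when the variable is absent or empty.
    host = env.get("NM_HOST") or container_hostname
    return host, port


if __name__ == "__main__":
    conf = {"spark.shuffle.service.enabled": "true"}
    env = {"NM_HOST": "nm-host-01.example.com"}
    # Prints ('nm-host-01.example.com', 7337)
    print(shuffle_server_address(conf, env, "a1b2c3d4e5f6", 40123))
```

In host networking mode the container hostname and `NM_HOST` would typically refer to the same machine, so this choice should only change behavior in the non-host networking case described above.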

            People

              Assignee: Unassigned
              Reporter: Mridul Muralidharan (mridulm80)