Spark / SPARK-21668

Ability to run driver programs within a container


    Details

      Description

      When a driver program in client mode runs in a Docker container, it binds to the IP address of the container, not that of the host machine. The container IP address is reachable only from within the host machine; master and worker nodes cannot reach it.
      For example, suppose the host machine has IP address 192.168.216.10. When Docker starts a container, it attaches the container to a bridge network and assigns it an IP address such as 172.17.0.2. Spark nodes on the 192.168.216.0 network cannot reach the bridge network that holds the container, so the driver program cannot communicate with the Spark cluster.
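      The reachability problem in this example can be illustrated with a small sketch. It only checks network membership and ignores any extra routing; the /24 mask and the helper name `reachable_from_cluster` are assumptions for illustration, since the description says only "192.168.216.0".

```python
import ipaddress

# Addresses taken from the example in the description.
HOST_IP = ipaddress.ip_address("192.168.216.10")
CONTAINER_IP = ipaddress.ip_address("172.17.0.2")

# Assumption: the cluster network uses a /24 mask.
CLUSTER_NET = ipaddress.ip_network("192.168.216.0/24")


def reachable_from_cluster(addr):
    """Return True if addr lies inside the cluster network.

    This is a simplification: it treats "same network" as "reachable"
    and does not model routing or port publishing.
    """
    return addr in CLUSTER_NET


print(reachable_from_cluster(HOST_IP))       # True
print(reachable_from_cluster(CONTAINER_IP))  # False
```

      The driver advertises the second address, which is why executors cannot connect back to it.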
      Spark already provides the SPARK_PUBLIC_DNS environment variable for this purpose. However, in this scenario, setting SPARK_PUBLIC_DNS to the host machine's IP address does not work.
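      As a possible direction, not something this issue confirms, the Spark configuration reference lists properties that separate the address the driver advertises from the address it binds to. The sketch below only assembles `spark-submit --conf` arguments from them. The port numbers and the helper names `driver_conf` and `to_submit_args` are hypothetical, and the ports would also have to be published from the container to the host.

```python
# Only the host IP comes from the description; everything else is assumed.
HOST_IP = "192.168.216.10"
DRIVER_PORT = 40000          # hypothetical fixed port, published from the container
BLOCK_MANAGER_PORT = 40001   # hypothetical fixed port, published from the container


def driver_conf(host_ip, driver_port, block_manager_port):
    """Build driver networking properties for a containerized driver."""
    return {
        # Address advertised to the master and executors.
        "spark.driver.host": host_ip,
        # Address the driver listens on inside the container.
        "spark.driver.bindAddress": "0.0.0.0",
        # Fixed ports so that they can be published from the container.
        "spark.driver.port": str(driver_port),
        "spark.driver.blockManager.port": str(block_manager_port),
    }


def to_submit_args(conf):
    """Turn a property dict into a flat list of --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


args = to_submit_args(driver_conf(HOST_IP, DRIVER_PORT, BLOCK_MANAGER_PORT))
print(" ".join(args))
```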

      Topic on StackOverflow: https://stackoverflow.com/questions/45489248/running-spark-driver-program-in-docker-container-no-connection-back-from-execu


            People

            Assignee: Unassigned
            Reporter: Arseniy Tashoyan (tashoyan)

                Time Tracking

                Original Estimate: 96h
                Remaining Estimate: 96h
                Time Spent: Not Specified