Spark / SPARK-6924

Driver hangs when the network is broken

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      In yarn-client mode, the client is deployed outside the cluster. When the network between the client and the cluster is broken, the driver loses all of its executors. The expected behavior is that the client returns and the application fails. Instead, the driver hangs, and the user cannot tell whether the application is still healthy. The driver should exit rather than hang.
      Proposed solution: in the HeartbeatReceiver thread, check at a fixed rate whether any executor has sent a heartbeat to the driver. If no executor is sending heartbeats to the driver, close the SparkContext.
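
      The proposal above could be sketched roughly as follows. This is a standalone illustration, not Spark's HeartbeatReceiver code: the class `HeartbeatWatchdog`, its methods, the 120-second timeout, and the executor ID `exec-1` are all hypothetical names and values chosen for the example. The clock is injectable so the behavior can be checked without waiting in real time.

```python
import threading
import time
from typing import Callable, Dict


class HeartbeatWatchdog:
    """Hypothetical driver-side watchdog (not part of Spark).

    Tracks the last heartbeat time of each executor and invokes a callback
    once when every known executor has been silent for longer than the timeout.
    """

    def __init__(self, timeout_s: float, on_all_lost: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._on_all_lost = on_all_lost
        self._clock = clock
        self._last_seen: Dict[str, float] = {}
        self._lock = threading.Lock()
        self._fired = False

    def record_heartbeat(self, executor_id: str) -> None:
        """Called whenever a heartbeat arrives from an executor."""
        with self._lock:
            self._last_seen[executor_id] = self._clock()

    def check(self) -> bool:
        """Run one check; return True if the callback fired during this call."""
        with self._lock:
            # Assumption: do nothing before the first heartbeat, so the
            # watchdog does not fire while executors are still starting up.
            if self._fired or not self._last_seen:
                return False
            now = self._clock()
            if any(now - seen <= self._timeout_s
                   for seen in self._last_seen.values()):
                return False
            self._fired = True
        # Invoke the callback outside the lock so it may take as long as it needs.
        self._on_all_lost()
        return True

    def run_periodically(self, interval_s: float,
                         stop_event: threading.Event) -> None:
        """Check at a fixed rate until stop_event is set."""
        while not stop_event.wait(interval_s):
            self.check()


# Usage with a fake clock so the example is deterministic.
now = [0.0]
events = []
watchdog = HeartbeatWatchdog(timeout_s=120.0,
                             on_all_lost=lambda: events.append("stop"),
                             clock=lambda: now[0])
watchdog.record_heartbeat("exec-1")
now[0] = 60.0
watchdog.check()   # exec-1 is still within the timeout, so nothing happens
now[0] = 200.0
watchdog.check()   # every executor has been silent too long; callback runs
```

      In a real driver the callback would presumably stop the SparkContext (for example by calling its `stop()` method) so that the client returns instead of hanging. Firing only once, and only after at least one heartbeat has been seen, are design choices of this sketch rather than anything stated in the issue.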

            People

              Assignee: Unassigned
              Reporter: xukun
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: