  1. Spark
  2. SPARK-22590

Broadcast thread propagates the localProperties to task

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.4.4, 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core

    Description

      Local properties set via sparkContext are not available as TaskContext properties when parallel jobs are executed and the thread pools have idle threads.

      Explanation:
      When parallel jobs run through BroadcastExchangeExec, the relationFuture is evaluated on a separate thread. These threads inherit the localProperties from sparkContext because they are created as child threads.
      The threads are managed by the executionContext (a thread pool). Each thread pool keeps idle threads alive for a default keepAliveSeconds of 60 seconds.
      When the pool holds idle threads that are reused for a subsequent query, the thread-local properties are not inherited again from sparkContext (they are inherited only at thread creation), so the reused thread carries old properties or none at all. As a result, task set properties are stale or missing when the child thread passes them on via sparkContext.runJob/submitJob.
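
The reuse problem described above can be reproduced without Spark. The following is a minimal sketch, not the attached test case: `localProp` is a hypothetical stand-in for the inheritable local properties held by sparkContext, and the single-thread pool stands in for the broadcast execution context. The last step shows one possible way to avoid the stale value, which is to capture the property in the submitting thread and set it explicitly inside the task; it is not presented as the change that was merged for this issue.

```scala
import java.util.concurrent.{Callable, Executors}

object StaleInheritedProps {
  // Hypothetical stand-in for sparkContext's inheritable local properties.
  val localProp = new InheritableThreadLocal[String]

  /** Returns the value seen by (first task, reused thread, task with explicit capture). */
  def run(): (String, String, String) = {
    // One worker thread, so the second and third tasks must reuse it.
    val pool = Executors.newFixedThreadPool(1)
    try {
      // Reads the thread-local as seen from inside the pool thread.
      def readInPool(): String = pool.submit(new Callable[String] {
        override def call(): String = localProp.get()
      }).get()

      localProp.set("query-1")
      // The worker thread is created by this submit and inherits "query-1".
      val first = readInPool()

      localProp.set("query-2")
      // The idle worker is reused; inheritance does not happen again.
      val reused = readInPool()

      // Possible workaround: capture in the submitting thread, set in the task.
      val captured = localProp.get()
      val propagated = pool.submit(new Callable[String] {
        override def call(): String = {
          val previous = localProp.get()
          localProp.set(captured)
          try localProp.get()
          finally localProp.set(previous) // restore so the pool thread is left unchanged
        }
      }).get()

      (first, reused, propagated)
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val (first, reused, propagated) = run()
    println(s"first=$first reused=$reused propagated=$propagated")
  }
}
```

In this sketch the second read still returns the value from the first query, which mirrors the old properties the description reports on reused broadcast threads.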

      A test case that simulates this behavior is attached.

      Attachments

        1. TestProps.scala
          3 kB
          Ajith S

            People

              Assignee: Ajith S (ajithshetty)
              Reporter: Ajith S (ajithshetty)
              Votes: 0
              Watchers: 2