Spark / SPARK-13181

Spark delay in task scheduling within executor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.5.2
    • Fix Version/s: 1.5.2
    • Component/s: Spark Core
    • Labels: None

    Description

      When a Spark job has some RDD partitions cached in memory and some still in Hadoop, the tasks within the executor that read from memory start in parallel, but the task that reads from Hadoop starts only after a delay.

      Repro:

      A logFile of 1.25 GB is given as input (5 partitions of 256 MB each).

      val logData = sc.textFile(logFile, 2).cache()
      var numAs = logData.filter(line => line.contains("a")).count()
      var numBs = logData.filter(line => line.contains("b")).count()

      Run the Spark job with 1 executor (6 GB memory, 12 cores).

      Stage A (counting lines with "a"): the executor starts 5 tasks in parallel, and all of them read data from Hadoop.

      Stage B (counting lines with "b"): as the data is now cached (4 partitions in memory, 1 in Hadoop), the executor starts 4 tasks in parallel and, after a ~4 second delay, starts the last task to read from Hadoop.

      Running the same Spark job with 12 GB memory, all 5 partitions are in memory and the 5 tasks in Stage B start in parallel.

      Running the job with 2 GB memory, all 5 partitions are in Hadoop and the 5 tasks in Stage B start in parallel.

      The task delay happens only when some partitions are in memory and some are in Hadoop.

      Check the attached image.
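
      The repro above can be launched with a command along these lines (a sketch; the application jar name and the YARN master are assumptions, but the memory and core flags match the three runs described):

      ```shell
      # Hypothetical launch for the repro; vary --executor-memory per run
      # (6g, 12g, 2g) to control how many partitions fit in the cache.
      spark-submit \
        --master yarn \
        --num-executors 1 \
        --executor-cores 12 \
        --executor-memory 6g \
        SimpleApp.jar
      ```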

      Attachments

        1. ran3.JPG
          94 kB
          Prabhu Joseph

        Activity

          prabhujoseph Prabhu Joseph added a comment -

          Okay, the reason for the task delay within the executor when some partitions are in memory and some in Hadoop, i.e., multiple locality levels (NODE_LOCAL and ANY): in this case the scheduler waits for spark.locality.wait (3 seconds by default). During this period, the scheduler waits to launch a data-local task before giving up and launching it on a less-local node. After setting it to 0, all tasks started in parallel. But I learned that it is better not to reduce it to 0.
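
          As a sketch, the configuration change described above could look like this (spark.locality.wait and its per-level variants are real Spark settings; the app name is illustrative, and as noted, 0 trades away data-locality scheduling):

          ```scala
          import org.apache.spark.{SparkConf, SparkContext}

          val conf = new SparkConf()
            .setAppName("LocalityWaitDemo")
            // Default is 3s: how long to wait for a data-local slot before
            // falling back to a less-local one. 0 removes the observed delay.
            .set("spark.locality.wait", "0")
          // Finer-grained knobs also exist if 0 across the board is too blunt:
          //   spark.locality.wait.process, spark.locality.wait.node,
          //   spark.locality.wait.rack
          val sc = new SparkContext(conf)
          ```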


          People

            Assignee: Unassigned
            Reporter: Prabhu Joseph
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: