SPARK-17648

TaskSchedulerImpl.resourceOffers should take an IndexedSeq, not a Seq

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.1.0
    • Component/s: Scheduler, Spark Core
    • Labels:
      None

      Description

      TaskSchedulerImpl.resourceOffers takes a Seq[WorkerOffer]. However, later on it indexes into this sequence by position. If you don't pass in an IndexedSeq, this turns an O(n) operation into an O(n^2) operation.

      In practice, this isn't an issue: just by chance, in the important places where this is called, the data structures already happen to be IndexedSeqs. But we ought to tighten up the types to make this clearer. I ran into this while doing some performance tests on the scheduler: performance was terrible when I passed in a Seq, and even a few hundred offers were scheduled very slowly.
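
The cost described above can be illustrated with a small Scala sketch. It is not the scheduler's code: `Offer` is a hypothetical stand-in for WorkerOffer, and `totalCoresSeq` and `totalCoresIndexed` are made-up functions that only mimic the "index by position in a loop" pattern. The point is that tightening the parameter type to IndexedSeq makes the cheap-indexing requirement visible to callers.

```scala
object OfferIndexing {
  // Hypothetical stand-in for Spark's WorkerOffer; the fields are illustrative.
  final case class Offer(executorId: String, host: String, cores: Int)

  // Accepts any Seq. If the caller passes a List, offers(i) walks i nodes,
  // so this positional loop does O(n^2) work overall.
  def totalCoresSeq(offers: Seq[Offer]): Int = {
    val n = offers.length
    var total = 0
    var i = 0
    while (i < n) {
      total += offers(i).cores // linear-time lookup on a List
      i += 1
    }
    total
  }

  // Same loop, but the signature requires an IndexedSeq (for example Vector
  // or a wrapped Array), where positional lookup is effectively constant
  // time, so the loop as a whole stays O(n).
  def totalCoresIndexed(offers: IndexedSeq[Offer]): Int = {
    val n = offers.length
    var total = 0
    var i = 0
    while (i < n) {
      total += offers(i).cores // cheap lookup on an IndexedSeq
      i += 1
    }
    total
  }
}
```

A caller that holds a List can convert once with `offers.toIndexedSeq`, a single O(n) copy, before calling the IndexedSeq version.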

            People

            • Assignee: Imran Rashid (irashid)
            • Reporter: Imran Rashid (irashid)