SPARK-17648

TaskSchedulerImpl.resourceOffers should take an IndexedSeq, not a Seq

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.1.0
    • Component/s: Scheduler, Spark Core
    • Labels: None

    Description

      TaskSchedulerImpl.resourceOffers takes in a Seq[WorkerOffer]. However, later on it indexes into this sequence by position. If you don't pass in an IndexedSeq, this turns an O(n) operation into an O(n^2) operation.

      In practice this isn't an issue, since the important call sites happen, just by chance, to already pass an IndexedSeq. But we ought to tighten up the types to make this requirement explicit. I ran into this while doing some performance tests on the scheduler: when I passed in a plain Seq, performance was terrible, and even a few hundred offers were scheduled very slowly.
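
      To illustrate the point above: on a linear Seq such as List, offers(i) has to walk i nodes, so a loop over all n positions costs roughly n(n-1)/2 steps, while on an IndexedSeq such as Vector each lookup is effectively constant time. The following is only a minimal sketch of tightening the parameter type, not the actual Spark change; Offer, IndexedOffers, totalCores and fromAnySeq are hypothetical names made up for this example.

```scala
// Hypothetical stand-in for Spark's WorkerOffer; NOT the real class.
final case class Offer(executorId: String, host: String, cores: Int)

object IndexedOffers {
  // Requiring IndexedSeq in the signature documents (and enforces at
  // compile time) that positional access offers(i) is cheap.
  def totalCores(offers: IndexedSeq[Offer]): Int = {
    var total = 0
    var i = 0
    while (i < offers.length) {
      total += offers(i).cores // positional access, effectively O(1) on Vector
      i += 1
    }
    total
  }

  // Callers holding a linear Seq (e.g. a List) pay one O(n) conversion
  // up front instead of an O(i) walk on every lookup.
  def fromAnySeq(offers: Seq[Offer]): IndexedSeq[Offer] = offers.toIndexedSeq
}
```

      With a signature like this, passing a List directly no longer compiles, so the caller has to convert explicitly, for example IndexedOffers.totalCores(IndexedOffers.fromAnySeq(someList)).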

          People

            Assignee: Imran Rashid (irashid)
            Reporter: Imran Rashid (irashid)
            Votes: 0
            Watchers: 2
