Spark / SPARK-17663

SchedulableBuilder should handle invalid data access via scheduler.allocation.file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: Scheduler, Spark Core
    • Labels: None

    Description

      If the file referenced by spark.scheduler.allocation.file contains an invalid minShare and/or weight value:

      • A NumberFormatException is thrown by the toInt call.
      • SparkContext cannot be initialized.
      • No meaningful error message is shown to the user.
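
The failure can be isolated from Spark entirely. The following is a minimal sketch, assuming only the Scala standard library; the object name ToIntFailureDemo and the helper parse are hypothetical and do not appear in Spark. It shows that calling toInt on a non-numeric string throws NumberFormatException unless the caller guards against it.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical demo object; not part of Spark.
object ToIntFailureDemo {
  // String.toInt delegates to java.lang.Integer.parseInt, which throws
  // NumberFormatException for non-numeric input. Try captures that exception.
  def parse(raw: String): Try[Int] = Try(raw.toInt)

  def main(args: Array[String]): Unit = {
    // The value of the <weight> element in the report's allocation file.
    parse("invalid_weight") match {
      case Success(w) => println(s"weight = $w")
      case Failure(e: NumberFormatException) => println(s"parse failed: ${e.getMessage}")
      case Failure(e) => throw e
    }
  }
}
```

Without a guard of this kind, the exception propagates out of the pool-building code and aborts SparkContext construction, as the stack trace in this report shows.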

      In short, this functionality can be made more robust by adopting one of the following approaches:

      1. Currently, if schedulingMode has an invalid value, a warning is logged and the default value FIFO is used. The same pattern can be applied to minShare (default: 0) and weight (default: 1).
      2. A meaningful error message can be shown to the user for all invalid cases.
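
The two approaches above could be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was merged: the object PoolPropertyParsing, the helpers intOrDefault and intOrError, and the warn callback are all hypothetical names, and the code uses only the Scala standard library rather than Spark's logging or XML handling.

```scala
import scala.util.Try

// Hypothetical helper object; names are illustrative and not taken from Spark.
object PoolPropertyParsing {
  // Defaults named in the report: minShare falls back to 0, weight to 1.
  val DefaultMinShare = 0
  val DefaultWeight = 1

  // Approach 1: log a warning and fall back to the default value.
  // `warn` stands in for a real logger and is an assumption of this sketch.
  def intOrDefault(
      poolName: String,
      property: String,
      raw: String,
      default: Int,
      warn: String => Unit = msg => Console.err.println(s"WARN $msg")): Int =
    Try(raw.trim.toInt).getOrElse {
      warn(s"Pool '$poolName': invalid $property value '$raw', using default $default")
      default
    }

  // Approach 2: reject the value, returning a descriptive message instead of
  // letting a bare NumberFormatException escape.
  def intOrError(poolName: String, property: String, raw: String): Either[String, Int] =
    Try(raw.trim.toInt).toOption.toRight(
      s"Pool '$poolName': $property must be an integer, but found '$raw'")
}
```

For the allocation file in this report, PoolPropertyParsing.intOrDefault("production", "weight", "invalid_weight", PoolPropertyParsing.DefaultWeight) would return the default weight after emitting a warning, while intOrError would return a Left carrying the message. The first approach keeps the application running with a possibly unintended configuration; the second fails fast but tells the user which pool and property to correct.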

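      Code to reproduce:

      import org.apache.spark.{SparkConf, SparkContext}

      // Enable FAIR scheduling and point it at the allocation file with the invalid weight.
      val conf = new SparkConf().setAppName("spark-fairscheduler").setMaster("local")
      conf.set("spark.scheduler.mode", "FAIR")
      conf.set("spark.scheduler.allocation.file", "src/main/resources/fairscheduler-invalid-data.xml")
      val sc = new SparkContext(conf) // fails here with NumberFormatException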
      

      fairscheduler-invalid-data.xml:

      <allocations>
          <pool name="production">
              <schedulingMode>FIFO</schedulingMode>
              <weight>invalid_weight</weight>
              <minShare>2</minShare>
          </pool>
      </allocations>
      

      Stack trace:

      Exception in thread "main" java.lang.NumberFormatException: For input string: "invalid_weight"
      	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
      	at java.lang.Integer.parseInt(Integer.java:580)
      	at java.lang.Integer.parseInt(Integer.java:615)
      	at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
      	at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
      	at org.apache.spark.scheduler.FairSchedulableBuilder$$anonfun$org$apache$spark$scheduler$FairSchedulableBuilder$$buildFairSchedulerPool$1.apply(SchedulableBuilder.scala:127)
      	at org.apache.spark.scheduler.FairSchedulableBuilder$$anonfun$org$apache$spark$scheduler$FairSchedulableBuilder$$buildFairSchedulerPool$1.apply(SchedulableBuilder.scala:102)
      

          People

            erenavsarogullari Eren Avsarogullari