Spark / SPARK-2094

Ensure exactly once semantics for DDL / Commands

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: SQL
    • Labels: None

    Description

      From lian cheng...
      The constraints presented here are:

      • The side effect of a command SchemaRDD should take place eagerly;
      • The side effect of a command SchemaRDD should take place once and only once;
      • When the .collect() method is called, something meaningful, usually the output message lines of the command, should be presented.

      Then how about adding a lazy field inside all the physical command nodes to wrap up the side effect and hold the command output? Take the SetCommandPhysical as an example:

      trait PhysicalCommand {
        // Abstract accessor; concrete commands override it with a lazy val
        // so that the side effect is evaluated at most once.
        def commandOutput: Any
      }
      
      case class SetCommandPhysical(
          key: Option[String], value: Option[String], output: Seq[Attribute])(
          @transient context: SQLContext)
        extends PhysicalCommand {
      
        override lazy val commandOutput: Any = {
          // Perform the side effect, and record appropriate output
          ???
        }
      
        def execute(): RDD[Row] = {
          // Wrap the cached command output in a single row
          val row = new GenericRow(Array[Any](commandOutput))
          context.sparkContext.parallelize(Seq(row), 1)
        }
      }
      

      In this way, all the constraints are met:

      • Eager evaluation: done by the toRdd call in SchemaRDDLike (PR #948),
      • Side effect should take place once and only once: ensured by the lazy commandOutput field,
      • Present meaningful output as RDD contents: command output is held by commandOutput and returned in execute().
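
      The once-only guarantee in the list above can be checked without Spark. The following is a minimal sketch, not the proposed implementation: FakeContext, SketchCommand, SetSketch and LazyOnceDemo are hypothetical stand-ins for SQLContext and the physical command nodes, and the first execute() call plays the role of the eager toRdd call.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SQLContext: a mutable key/value store plus a
// counter recording how many times a side effect was performed.
final class FakeContext {
  val settings = mutable.Map.empty[String, String]
  var sideEffectCount = 0
}

trait SketchCommand {
  // Concrete commands override this with a lazy val.
  def commandOutput: Seq[String]

  // Stand-in for execute(): returns the cached output as "rows".
  def execute(): Seq[String] = commandOutput
}

final class SetSketch(key: String, value: String, ctx: FakeContext)
  extends SketchCommand {

  // The body runs on first access only; later accesses reuse the result.
  override lazy val commandOutput: Seq[String] = {
    ctx.settings(key) = value
    ctx.sideEffectCount += 1
    Seq(s"$key=$value")
  }
}

object LazyOnceDemo {
  def run(): (Int, Seq[String], Seq[String]) = {
    val ctx = new FakeContext
    val cmd = new SetSketch("demo.key", "10", ctx)
    cmd.execute()               // eager trigger, in place of the toRdd call
    val first = cmd.execute()   // e.g. a later .collect()
    val second = cmd.execute()  // and another one
    (ctx.sideEffectCount, first, second)
  }
}
```

      Calling LazyOnceDemo.run() executes the command three times, yet the counter in FakeContext is incremented only once, and every call returns the same output lines.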

      An additional benefit is that the side effect logic of each command can be implemented within its own physical command node, instead of adding special cases inside SQLContext.toRdd and/or HiveContext.toRdd.
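
      To illustrate the point about avoiding special cases, here is a small sketch under the same assumptions as the proposal. All names (PlanNodeSketch, ScanNodeSketch, LogNodeSketch, RunnerSketch) are hypothetical and none of them is Spark API. The runner treats query nodes and command nodes uniformly, because each command keeps its side effect inside its own node.

```scala
import scala.collection.mutable

// Hypothetical minimal plan-node interface.
trait PlanNodeSketch {
  def execute(): Seq[String]
}

// An ordinary query node with no side effect.
final class ScanNodeSketch(rows: Seq[String]) extends PlanNodeSketch {
  def execute(): Seq[String] = rows
}

// A command node: the side effect (appending to a log) lives in the node
// itself, guarded by a lazy val so it happens at most once.
final class LogNodeSketch(message: String, log: mutable.Buffer[String])
  extends PlanNodeSketch {

  private lazy val output: Seq[String] = {
    log += message
    Seq(s"logged: $message")
  }

  def execute(): Seq[String] = output
}

object RunnerSketch {
  // No pattern match on command types is needed here.
  def toRows(plan: PlanNodeSketch): Seq[String] = plan.execute()
}
```

      Adding a new command in this sketch means adding one node class; RunnerSketch.toRows stays unchanged.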

          People

            Assignee: Cheng Lian (lian cheng)
            Reporter: Michael Armbrust (marmbrus)