Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-13473

Predicate can't be pushed through project with nondeterministic field

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0, 2.0.0
    • Fix Version/s: 1.5.3, 1.6.1, 2.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      The following Spark shell snippet reproduces this issue:

      import org.apache.spark.sql.functions._
      
      val parallelism = 8 // Adjust this to default parallelism
      val df = sqlContext.
        range(2 * parallelism). // 8 partitions, 2 elements per partition
        select(
          col("id"),
          monotonicallyIncreasingId().as("long_id")
        )
      
      df.show()
      
      // +---+-----------+
      // | id|    long_id|
      // +---+-----------+
      // |  0|          0|
      // |  1|          1|
      // |  2| 8589934592|
      // |  3| 8589934593|
      // |  4|17179869184|
      // |  5|17179869185|
      // |  6|25769803776|
      // |  7|25769803777|
      // |  8|34359738368|
      // |  9|34359738369|
      // | 10|42949672960|
      // | 11|42949672961|
      // | 12|51539607552|
      // | 13|51539607553|
      // | 14|60129542144|
      // | 15|60129542145|
      // +---+-----------+
      
      df.
        filter(col("id") === 3). // 2nd element in the 2nd partition
        show()
      
      // +---+----------+
      // | id|   long_id|
      // +---+----------+
      // |  3|8589934592|
      // +---+----------+
      
      

      monotonicallyIncreasingId is nondeterministic.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                lian cheng Cheng Lian
                Reporter:
                lian cheng Cheng Lian
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: