SPARK-38570

Incorrect DynamicPartitionPruning caused by Literal

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.0.4, 3.3.0, 3.1.4, 3.2.2
    • Component/s: SQL
    • Labels: None

    Description

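      Literal.references returns an empty AttributeSet. Because the empty set is a subset of every set, the check resExp.references.subsetOf(partitionColumns) passes trivially when resExp is a Literal, so the Literal is mistaken for a partition column.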

       

      org.apache.spark.sql.execution.dynamicpruning.PartitionPruning#getFilterableTableScan:

      val srcInfo: Option[(Expression, LogicalPlan)] = findExpressionAndTrackLineageDown(a, plan)
      srcInfo.flatMap {
        case (resExp, l: LogicalRelation) =>
          l.relation match {
            case fs: HadoopFsRelation =>
              val partitionColumns = AttributeSet(
                l.resolve(fs.partitionSchema, fs.sparkSession.sessionState.analyzer.resolver))
              // Problem: when resExp is a Literal, resExp.references is empty, so the
              // subsetOf check below is trivially true and the Literal is treated as
              // a partition column.
              if (resExp.references.subsetOf(partitionColumns)) {
                return Some(l)
              } else {
                None
              }
            case _ => None
          }
        // (any remaining cases are omitted from this excerpt)
      }
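
The subset check can be reproduced without Spark. The following is a minimal sketch using simplified stand-in types: `Expr`, `Attr`, `Lit`, `originalCheck`, and `guardedCheck` are hypothetical names for illustration, not Catalyst classes. The `nonEmpty` guard is one possible way to exclude reference-free expressions and is an assumption, not necessarily how this issue was actually resolved.

```scala
// Simplified stand-ins for Catalyst expressions (hypothetical, not Spark's classes).
object DppLiteralSketch {
  sealed trait Expr { def references: Set[String] }

  // An attribute refers to exactly one column, itself.
  final case class Attr(name: String) extends Expr {
    def references: Set[String] = Set(name)
  }

  // Like Literal.references in the description, a literal refers to no columns.
  final case class Lit(value: Any) extends Expr {
    def references: Set[String] = Set.empty
  }

  // Mirrors the check in the excerpt: true for any expression without references,
  // because the empty set is a subset of every set.
  def originalCheck(e: Expr, partitionColumns: Set[String]): Boolean =
    e.references.subsetOf(partitionColumns)

  // Assumed guard for illustration: additionally require at least one reference.
  def guardedCheck(e: Expr, partitionColumns: Set[String]): Boolean =
    e.references.nonEmpty && e.references.subsetOf(partitionColumns)

  def main(args: Array[String]): Unit = {
    val partitionColumns = Set("dt")
    println(originalCheck(Lit(1), partitionColumns))     // true: literal accepted by mistake
    println(guardedCheck(Lit(1), partitionColumns))      // false: literal rejected
    println(guardedCheck(Attr("dt"), partitionColumns))  // true: real partition column
    println(guardedCheck(Attr("id"), partitionColumns))  // false: non-partition column
  }
}
```

With the guard in place, only expressions that reference at least one column, all of them partition columns, qualify as filterable.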

          People

            Assignee: mcdull_zhang
            Reporter: mcdull_zhang
            Votes: 0
            Watchers: 3
