Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-23356

Pushes Project to both sides of Union when expression is non-deterministic

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      Currently, PushProjectionThroughUnion optimizer only supports pushdown project operator to both sides of a Union operator when expression is deterministic , in fact, we can be like pushdown filters, also support pushdown project operator to both sides of a Union operator when expression is non-deterministic , this PR description fix this problem。now the explain looks like:

      === Applying Rule org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion ===
      Input LogicalPlan:
      Project a#0, rand(10) AS rnd#9
      +- Union
      :- LocalRelation <empty>, a#0, b#1, c#2
      :- LocalRelation <empty>, d#3, e#4, f#5
      +- LocalRelation <empty>, g#6, h#7, i#8

      Output LogicalPlan:
      Project a#0, rand(10) AS rnd#9
      +- Union
      :- Project a#0
      : +- LocalRelation <empty>, a#0, b#1, c#2
      :- Project d#3
      : +- LocalRelation <empty>, d#3, e#4, f#5
      +- Project g#6
      +- LocalRelation <empty>, g#6, h#7, i#8

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              heary-cao caoxuewen
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated: