SPARK-21807

The getAliasedConstraints function in LogicalPlan takes a long time when the number of expressions is greater than 100


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None
    • Target Version/s:

      Description

      The getAliasedConstraints function in LogicalPlan.scala clones the expression set each time an element is added,
      which takes a long time when there are many expressions.
      Before the change, the cost of getAliasedConstraints was:
      100 expressions: 41 seconds
      150 expressions: 466 seconds
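To make the cost pattern concrete, here is a minimal, self-contained sketch of a set that clones its backing storage on every add. `CopyOnAddSet` and `CopyOnAddDemo` are hypothetical names invented for illustration; this is not Spark's actual expression set, only an assumed model of the behavior described above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a set whose `+` clones its backing storage.
// It models the behavior described in this issue; it is not Spark code.
final class CopyOnAddSet[A](private val underlying: mutable.HashSet[A]) {
  def +(elem: A): CopyOnAddSet[A] = {
    val copy = underlying.clone() // copies every existing element on each add
    copy += elem
    new CopyOnAddSet(copy)
  }
  def size: Int = underlying.size
}

object CopyOnAddDemo {
  // Adding the k-th distinct element first clones a set of size k - 1,
  // so n adds copy 0 + 1 + ... + (n - 1) = n * (n - 1) / 2 elements in total.
  def elementsCopied(n: Int): Long = (0 until n).map(_.toLong).sum

  // Builds the set one element at a time, cloning on every add.
  def buildByCloning(n: Int): CopyOnAddSet[Int] =
    (1 to n).foldLeft(new CopyOnAddSet(mutable.HashSet.empty[Int]))(_ + _)

  // Builds the same contents by mutating a single set in place.
  def buildInPlace(n: Int): mutable.HashSet[Int] = {
    val set = mutable.HashSet.empty[Int]
    (1 to n).foreach(set += _)
    set
  }

  def main(args: Array[String]): Unit = {
    val n = 2000
    println(s"elements copied while cloning: ${elementsCopied(n)}")
    println(s"sizes: ${buildByCloning(n).size} vs ${buildInPlace(n).size}")
  }
}
```

Both builders produce the same contents; only the clone-per-add version pays a copying cost that grows quadratically with the number of elements added.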

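      The test looks like this:
      // Imports assumed for a Catalyst test suite; the test body is from the report.
      import org.apache.spark.sql.catalyst.expressions.{Alias, Literal}
      import org.apache.spark.sql.catalyst.expressions.aggregate.Count
      import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LocalRelation}

      test("getAliasedConstraints") {
        val expressionNum = 150
        // Every alias wraps the same expression, Count(Literal(1)).
        val aggExpression = (1 to expressionNum).map(i => Alias(Count(Literal(1)), s"cnt$i")())
        val aggPlan = Aggregate(Nil, aggExpression, LocalRelation())

        // Time how long it takes to compute the constraints.
        val beginTime = System.currentTimeMillis()
        val expressions = aggPlan.validConstraints
        println(s"validConstraints cost: ${System.currentTimeMillis() - beginTime}ms")
        // The size of Aliased expression is n * (n - 1) / 2 + n
        assert(expressions.size === expressionNum * (expressionNum - 1) / 2 + expressionNum)
      }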
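As a worked check of the formula in the assertion, the sketch below evaluates n * (n - 1) / 2 + n for the two expression counts quoted in the timings. `ExpectedConstraintCount` and `expectedSize` are hypothetical helper names used only for this illustration.

```scala
object ExpectedConstraintCount {
  // Size asserted by the test: n * (n - 1) / 2 + n, which equals n * (n + 1) / 2.
  def expectedSize(n: Int): Int = n * (n - 1) / 2 + n

  def main(args: Array[String]): Unit = {
    // 100 -> 5050, 150 -> 11325
    Seq(100, 150).foreach { n =>
      println(s"$n expressions -> ${expectedSize(n)} aliased constraints")
    }
  }
}
```

The result set therefore grows quadratically with the number of aliased expressions, so any per-add copying of that set adds up quickly.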

            People

            • Assignee: eaton eaton
            • Reporter: eaton eaton
            • Votes: 0
            • Watchers: 5
