  1. Spark
  2. SPARK-21807

The getAliasedConstraints function in LogicalPlan takes a long time when the number of expressions is greater than 100


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      The getAliasedConstraints function in LogicalPlan.scala clones the expression set every time an element is added,
      which takes a long time when there are many expressions.
      Before the change, getAliasedConstraints took:
      100 expressions: 41 seconds
      150 expressions: 466 seconds

      The test looks like this:
      test("getAliasedConstraints") {
        val expressionNum = 150
        val aggExpression = (1 to expressionNum).map(i => Alias(Count(Literal(1)), s"cnt$i")())
        val aggPlan = Aggregate(Nil, aggExpression, LocalRelation())

        val beginTime = System.currentTimeMillis()
        val expressions = aggPlan.validConstraints
        println(s"validConstraints cost: ${System.currentTimeMillis() - beginTime}ms")
        // The number of aliased expressions is n * (n - 1) / 2 + n
        assert(expressions.size === expressionNum * (expressionNum - 1) / 2 + expressionNum)
      }
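
The cost pattern described above (cloning the whole set each time one element is added) can be illustrated without Spark. The following is only a minimal sketch under stated assumptions: `CloneCostSketch`, `addWithCloning`, `addInPlace`, and `aliasedConstraintCount` are hypothetical names made up for this illustration, a plain `mutable.HashSet[Int]` stands in for the expression set, and none of this is the actual Spark code or the actual fix. It counts how many elements get copied under each strategy.

```scala
import scala.collection.mutable

// Hypothetical, Spark-independent illustration of clone-on-add cost.
object CloneCostSketch {

  /** Expected number of aliased constraints for n expressions, per the comment in the test. */
  def aliasedConstraintCount(n: Int): Int = n * (n - 1) / 2 + n

  /** Adds n elements, cloning the whole set before each add.
    * Returns (final size, total number of elements copied). */
  def addWithCloning(n: Int): (Int, Long) = {
    var current = mutable.HashSet.empty[Int]
    var copied = 0L
    for (i <- 0 until n) {
      copied += current.size       // every existing element is copied by clone()
      val next = current.clone()
      next += i
      current = next
    }
    (current.size, copied)
  }

  /** Adds n elements to a single mutable set; nothing is copied. */
  def addInPlace(n: Int): (Int, Long) = {
    val current = mutable.HashSet.empty[Int]
    for (i <- 0 until n) current += i
    (current.size, 0L)
  }

  def main(args: Array[String]): Unit = {
    // Size of the constraint set produced by the 150-expression test case.
    val n = aliasedConstraintCount(150)
    println(s"elements to add: $n")
    println(s"clone on every add: ${addWithCloning(n)}")
    println(s"single mutable set: ${addInPlace(n)}")
  }
}
```

With clone-on-add, the number of copied elements grows as n * (n - 1) / 2 in the number of insertions, and the number of insertions itself grows quadratically with the number of aliased expressions, which is in line with the steep jump from 100 to 150 expressions reported above.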

          People

            Assignee: eaton eaton
            Reporter: eaton eaton
            Votes: 0
            Watchers: 5
