Spark / SPARK-17306

QuantileSummaries doesn't compress


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

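      compressThreshold is declared as a constructor parameter of QuantileSummaries, but it is not referenced anywhere else, so the summary is never compressed: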

      class QuantileSummaries(
          val compressThreshold: Int,
          val relativeError: Double,
          val sampled: ArrayBuffer[Stats] = ArrayBuffer.empty,
          private[stat] var count: Long = 0L,
          val headSampled: ArrayBuffer[Double] = ArrayBuffer.empty) extends Serializable
      

      This causes a memory leak: QuantileSummaries takes unbounded memory as values are inserted.

      val summary = new QuantileSummaries(10000, relativeError = 0.001)
      // Results in creating an array of size 100000000 !!! 
      (1 to 100000000).foreach(summary.insert(_))
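
      To illustrate the behavior the report expects, here is a minimal, self-contained sketch of an insert path that consults compressThreshold. It is not Spark's QuantileSummaries and not the actual fix: the class name BoundedSamples, the keep parameter, and the "keep evenly spaced sorted values" compaction are all assumptions made for illustration, standing in for a real error-bounded merge.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical toy class, NOT Spark's QuantileSummaries. It only illustrates
// checking a threshold on insert so the buffer cannot grow without bound.
class BoundedSamples(val compressThreshold: Int, val keep: Int) {
  require(keep > 0 && keep < compressThreshold,
    "keep must be in (0, compressThreshold)")

  private val sampled = ArrayBuffer.empty[Double]
  private var count = 0L

  def insert(x: Double): Unit = {
    sampled += x
    count += 1
    // The check the report says is missing: consult the threshold on insert.
    if (sampled.length >= compressThreshold) compress()
  }

  // Toy compaction (an assumption): sort, then keep `keep` evenly spaced
  // values. A real summary would merge samples under an error bound instead.
  private def compress(): Unit = {
    val sorted = sampled.toArray
    java.util.Arrays.sort(sorted)
    val step = sorted.length.toDouble / keep
    val kept = (0 until keep).map(i => sorted((i * step).toInt))
    sampled.clear()
    sampled ++= kept
  }

  def size: Int = sampled.length   // values currently retained
  def total: Long = count          // values inserted so far
}

object BoundedSamplesDemo {
  def main(args: Array[String]): Unit = {
    val s = new BoundedSamples(compressThreshold = 10000, keep = 1000)
    (1 to 1000000).foreach(i => s.insert(i.toDouble))
    // The retained buffer stays below compressThreshold regardless of input size.
    println(s"inserted=${s.total}, retained=${s.size}")
  }
}
```

      With a check like this, the number of retained values stays below compressThreshold no matter how many values are inserted, which is the property the reproduction above shows to be missing.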
      

          People

            Assignee: srowen (Sean R. Owen)
            Reporter: clockfly (Sean Zhong)