  1. Spark
  2. SPARK-7181

ExternalSorter merge with aggregation goes into an infinite loop when there is a total ordering

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.2.3, 1.3.2, 1.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      In the function mergeWithAggregation of ExternalSorter.scala, when a total ordering is defined for the keys K, consecutive values with the same key in the sorted iterator should be combined. Currently this is done as follows:

        val elem = sorted.next()
        val k = elem._1
        var c = elem._2
        while (sorted.hasNext && sorted.head._1 == k) {
          c = mergeCombiners(c, sorted.head._2)
        }
      

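      This loops forever whenever more than one value shares the same key: `sorted.head` only peeks at the next element without consuming it, so the loop condition never changes. The fix is to call `sorted.next()` inside the loop so that each merged element is consumed.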
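
      The fix described above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual Spark patch: the object `MergeSketch` and the function `mergeSorted` are hypothetical names, and the input is assumed to be already sorted by key so that equal keys are adjacent.

```scala
// Minimal sketch of the corrected loop; not the actual ExternalSorter code.
// MergeSketch and mergeSorted are hypothetical names used for illustration.
object MergeSketch {
  // Assumes `input` is already sorted so that equal keys are adjacent.
  def mergeSorted[K, C](
      input: Iterator[(K, C)],
      mergeCombiners: (C, C) => C): Iterator[(K, C)] = {
    // A BufferedIterator provides head, which peeks without consuming.
    val sorted = input.buffered
    new Iterator[(K, C)] {
      override def hasNext: Boolean = sorted.hasNext
      override def next(): (K, C) = {
        val elem = sorted.next()
        val k = elem._1
        var c = elem._2
        while (sorted.hasNext && sorted.head._1 == k) {
          // next() consumes the element that head only peeked at,
          // so the loop advances and eventually terminates.
          c = mergeCombiners(c, sorted.next()._2)
        }
        (k, c)
      }
    }
  }
}
```

      For example, merging the pairs ("a", 1), ("a", 2), ("b", 3) with addition as the combiner terminates and yields one combined pair per key, whereas the original loop would never leave the first key.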

          People

            Assignee: Qiping Li (chouqin)
            Reporter: Qiping Li (chouqin)