Spark / SPARK-2043

ExternalAppendOnlyMap doesn't always find matching keys

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.9.1, 1.0.0
    • Fix Version/s: 0.9.2, 1.0.1, 1.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      When it finishes reading the keys with the current hash code, the current implementation reads only one key with the next hash code, so it may miss some of the matches for that next key. As a result, operations such as join can give the wrong result when reduce tasks spill to disk and there are hash collisions, because values for the same key are not matched together.
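
      The merge behaviour described above can be illustrated with a minimal Python sketch. It is not Spark's ExternalAppendOnlyMap code. It assumes each spilled stream is a list of (key, value) pairs sorted by hash code, and it uses a hypothetical `toy_hash` that collides on purpose. The names `toy_hash`, `read_group` and `merge_spills` are made up for this illustration. The point it shows is that every pair with the current hash code must be read from every stream before keys are matched, so that colliding keys are not left behind.

```python
from collections import defaultdict


def toy_hash(key):
    # Hypothetical hash that collides on purpose:
    # keys of equal length share a hash code.
    return len(key)


def read_group(stream, pos, hash_code):
    """Read every consecutive pair from stream[pos:] whose key has hash_code."""
    group = []
    while pos < len(stream) and toy_hash(stream[pos][0]) == hash_code:
        group.append(stream[pos])
        pos += 1
    return group, pos


def merge_spills(streams):
    """Merge hash-sorted streams of (key, value), grouping values by exact key."""
    positions = [0] * len(streams)
    merged = []
    while True:
        # Hash codes of the next unread pair in each non-exhausted stream.
        heads = [toy_hash(s[p][0]) for s, p in zip(streams, positions) if p < len(s)]
        if not heads:
            return merged
        current = min(heads)
        # Pull the full run of pairs with the current hash code from every
        # stream before matching keys, so colliding keys are all present.
        by_key = defaultdict(list)
        for i, stream in enumerate(streams):
            group, positions[i] = read_group(stream, positions[i], current)
            for key, value in group:
                by_key[key].append(value)
        merged.extend(sorted(by_key.items()))


# "ab" and "cd" collide under toy_hash and appear in a different order per stream.
spill_a = [("ab", 1), ("cd", 2), ("xyz", 3)]
spill_b = [("cd", 20), ("ab", 10), ("xyz", 30)]
result = merge_spills([spill_a, spill_b])
```

      In this sketch, a reader that buffered only the first pair with a new hash code would see "ab" from one stream and "cd" from the other and fail to pair them, which is the kind of missed match the description refers to.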

          People

            Assignee: Matei Alexandru Zaharia (matei)
            Reporter: Matei Alexandru Zaharia (matei)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: