Spark / SPARK-2043

ExternalAppendOnlyMap doesn't always find matching keys


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.9.1, 1.0.0
    • Fix Version/s: 0.9.2, 1.0.1, 1.1.0
    • Component/s: Spark Core
    • Labels: None

      Description

      When the current implementation finishes reading the keys with the current hash code, it also reads one key with the next hash code, which may cause it to miss some matches for that next key. As a result, operations such as join can return wrong results when reduce tasks spill to disk and hash collisions occur, because values belonging to the same key are not matched together.
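      The requirement behind this report can be illustrated with a minimal sketch. It is not Spark's actual Scala code, and the names `merge_spills`, `key_hash`, and `combine` are hypothetical. It assumes each spilled run is a list of (key, value) pairs sorted by the key's hash code, and it buffers every pair that shares a hash code, across all runs, before emitting anything, so that colliding keys are still matched by equality.

```python
import heapq
from itertools import groupby
from operator import add


def merge_spills(spills, key_hash, combine):
    """Merge runs of (key, value) pairs, each sorted by key_hash(key).

    Yields (key, combined_value) pairs. Hypothetical helper for illustration.
    """
    # Tag each pair with its hash code so the runs can be merged in hash order.
    tagged_runs = [[(key_hash(k), k, v) for k, v in run] for run in spills]
    merged_stream = heapq.merge(*tagged_runs, key=lambda t: t[0])

    for _, bucket in groupby(merged_stream, key=lambda t: t[0]):
        # Buffer ALL pairs with this hash code, from every run, before
        # emitting. Distinct keys that collide are kept apart by equality.
        combined = {}
        for _, k, v in bucket:
            combined[k] = combine(combined[k], v) if k in combined else v
        yield from combined.items()


# A deliberately weak hash (string length) forces "a" and "b" to collide.
weak_hash = len
run1 = [("a", 1), ("b", 2), ("cc", 3)]
run2 = [("b", 10), ("a", 20), ("dd", 30)]

result = dict(merge_spills([run1, run2], weak_hash, add))
print(result)
```

      Here "a" and "b" appear in a different order in each run but share a hash code. Because the whole bucket is buffered before any key is emitted, both keys still find their counterparts from the other run.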

            People

            • Assignee: Matei Alexandru Zaharia (matei)
            • Reporter: Matei Alexandru Zaharia (matei)
