Crunch (Retired) / CRUNCH-213

Add sharded join functionality

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None

    Description

      Joins in which a large proportion of the values on one or both sides map to a single key can perform poorly: one reducer (or a small number of reducers) ends up handling most of the joining work, leaving the rest of the cluster idle.

      Sharded joining should be added to allow splitting up join keys, thereby distributing values mapped to a single key over multiple reducer partitions.
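
      The idea can be sketched in plain Java, independent of the Crunch API. This is a minimal illustration and not the implementation in the attached patches: the class name ShardedJoinSketch, the method shardedJoin, and the numShards and seed parameters are all hypothetical. It assumes an inner join in which the left side carries the skewed key. Each left record is sent to one randomly chosen shard of its key, and each right record is replicated to every shard, so the composite (key, shard) pairs can be joined independently.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of key sharding for a skewed inner join; not Crunch code.
public class ShardedJoinSketch {

    static Map.Entry<String, String> entry(String key, String value) {
        return new SimpleImmutableEntry<String, String>(key, value);
    }

    // Returns one [key, leftValue, rightValue] row per matching pair.
    static List<List<String>> shardedJoin(List<Map.Entry<String, String>> left,
                                          List<Map.Entry<String, String>> right,
                                          int numShards, long seed) {
        Random random = new Random(seed);

        // Left side: scatter each record to one random shard of its key.
        Map<Map.Entry<String, Integer>, List<String>> leftByShard = new HashMap<>();
        for (Map.Entry<String, String> e : left) {
            Map.Entry<String, Integer> shardedKey =
                new SimpleImmutableEntry<String, Integer>(e.getKey(), random.nextInt(numShards));
            leftByShard.computeIfAbsent(shardedKey, k -> new ArrayList<String>()).add(e.getValue());
        }

        // Right side: replicate each record to every shard of its key,
        // so that any left record can find it regardless of its shard.
        Map<Map.Entry<String, Integer>, List<String>> rightByShard = new HashMap<>();
        for (Map.Entry<String, String> e : right) {
            for (int shard = 0; shard < numShards; shard++) {
                Map.Entry<String, Integer> shardedKey =
                    new SimpleImmutableEntry<String, Integer>(e.getKey(), shard);
                rightByShard.computeIfAbsent(shardedKey, k -> new ArrayList<String>()).add(e.getValue());
            }
        }

        // Each (key, shard) group is joined on its own; in a distributed job
        // these groups could be handled by different reducer partitions.
        List<List<String>> joined = new ArrayList<>();
        for (Map.Entry<Map.Entry<String, Integer>, List<String>> group : leftByShard.entrySet()) {
            List<String> rightValues = rightByShard.get(group.getKey());
            if (rightValues == null) {
                continue; // inner join: no match on the right side
            }
            String key = group.getKey().getKey();
            for (String leftValue : group.getValue()) {
                for (String rightValue : rightValues) {
                    joined.add(Arrays.asList(key, leftValue, rightValue));
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> left = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            left.add(entry("hot", "L" + i)); // skewed key
        }
        left.add(entry("cold", "Lc"));
        List<Map.Entry<String, String>> right = new ArrayList<>();
        right.add(entry("hot", "R0"));
        right.add(entry("hot", "R1"));
        right.add(entry("cold", "Rc"));

        System.out.println(shardedJoin(left, right, 4, 42L).size() + " joined rows");
    }
}
```

      The trade-off in this sketch is that the replicated side grows by a factor of numShards, so sharding of this kind pays off mainly when the skewed side is much larger than the side being replicated.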

      Attachments

        1. CRUNCH-213.patch
          81 kB
          Gabriel Reid
        2. CRUNCH-213.patch
          86 kB
          Gabriel Reid
        3. CRUNCH-213.patch
          89 kB
          Gabriel Reid

          People

            Assignee: Gabriel Reid
            Reporter: Gabriel Reid
            Votes: 0
            Watchers: 3
