Beam / BEAM-4565

Hot key fanout should not distribute keys to all shards.

Details

    • Type: Task
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.4.0, 2.5.0
    • Fix Version/s: None
    • Component/s: sdk-java-core, sdk-py-core

    Description

      The goal of hot key fanout is to reduce the number of values sent to a single post-GBK worker. If combiner lifting happens, however, each bundle will send a single value per sub-key. This causes an N-fold blowup in shuffle data, and each of the N reducers has the same amount of data to consume as the single reducer in the non-fanout case.
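
      The effect described above can be illustrated with a toy model. This is a sketch only, not Beam code: the function `shuffle_load`, its parameters, and the "pin each bundle to one shard" alternative are hypothetical names and assumptions introduced for illustration. It counts how many pre-combined records each sub-key reducer receives for one hot key, assuming combiner lifting emits one partial result per (bundle, sub-key) pair.

```python
from collections import Counter


def shuffle_load(num_bundles, values_per_bundle, fanout, spread_across_all_shards):
    """Count the pre-combined records each sub-key reducer receives for one hot key.

    Toy model (assumption): combiner lifting emits exactly one partial
    accumulator per (bundle, shard) pair that the bundle touched.
    """
    per_shard = Counter()
    for bundle in range(num_bundles):
        if spread_across_all_shards:
            # Values are assigned to shards in turn, so a large bundle
            # ends up touching every shard.
            shards_touched = {i % fanout for i in range(values_per_bundle)}
        else:
            # Hypothetical alternative: pin the whole bundle to one shard.
            shards_touched = {bundle % fanout}
        for shard in shards_touched:
            per_shard[shard] += 1
    return per_shard


all_shards = shuffle_load(100, 1000, 10, spread_across_all_shards=True)
one_shard = shuffle_load(100, 1000, 10, spread_across_all_shards=False)

# Total shuffled records, and the load on the busiest reducer.
print(sum(all_shards.values()), max(all_shards.values()))  # 1000 100
print(sum(one_shard.values()), max(one_shard.values()))    # 100 10
```

      With 100 bundles and a fanout of 10, spreading across all shards shuffles 1000 records and leaves every reducer with 100 of them, the same as the single reducer would see without fanout. Pinning each bundle to one shard shuffles 100 records, 10 per reducer.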

            People

              Assignee: Unassigned
              Reporter: Robert Bradshaw (robertwb)

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h
