Beam / BEAM-13080

Specify numBuckets in Python Reshuffle

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.35.0
    • Component/s: sdk-py-core
    • Labels: None

    Description

      The Java SDK has `withNumBuckets` to set the number of generated keys in `Reshuffle`, but the Python SDK has no equivalent.

      This is particularly needed when the input data is small and you want to take advantage of the DoFn methods `start_bundle` / `finish_bundle`.

      https://beam.apache.org/releases/javadoc/2.33.0/org/apache/beam/sdk/transforms/Reshuffle.ViaRandomKey.html#withNumBuckets-java.lang.Integer-
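
To illustrate the request, here is a minimal plain-Python sketch of what capping the number of generated keys does. It is not Beam code and not the implementation from this ticket: the helper `reshuffle_into_buckets` and its `num_buckets` parameter are hypothetical names chosen for illustration, and a real runner decides bundle boundaries itself.

```python
import random
from collections import defaultdict


def reshuffle_into_buckets(elements, num_buckets=None, seed=None):
    """Group elements under random keys, loosely mimicking a random-key reshuffle.

    Hypothetical helper for illustration only (not part of Apache Beam).
    With num_buckets=None each element draws a key from a large random range,
    so a small input tends to end up with about one element per key.
    With num_buckets=N keys are drawn from range(N), so at most N groups exist.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for element in elements:
        if num_buckets is None:
            key = rng.getrandbits(32)  # effectively unbounded key space
        else:
            key = rng.randrange(num_buckets)  # key space capped at num_buckets
        groups[key].append(element)
    return dict(groups)


records = list(range(20))
unbounded = reshuffle_into_buckets(records, seed=0)
bounded = reshuffle_into_buckets(records, num_buckets=4, seed=0)

print("groups without a cap:", len(unbounded))
print("groups with num_buckets=4:", len(bounded))
```

With a small input, the capped version yields a few larger groups rather than many tiny ones, which is the situation in which per-bundle setup and teardown work is easier to amortize.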

          People

            Assignee: Unassigned
            Reporter: Inigo San Jose Visiers
            Votes: 0
            Watchers: 1

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 5h 50m