Beam / BEAM-2716

AvroReader should refuse dynamic splits while in the last block

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Component/s: io-java-avro

    Description

      AvroReader is able to detect when it's in the last block:
      https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java#L728

      It could also use this information to avoid wastefully producing dynamic splits that start within the range of the current block.

      One way to do this would be to give OffsetRangeTracker a "claim range" operation. Claiming the range [a, b) would be equivalent, in terms of correctness, to claiming "a" (it checks whether "a" is within the tracker's range), but it would set the last claimed position to "b" rather than "a", thus protecting more positions from being split away.
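
A minimal sketch of the "claim range" idea follows. It assumes a simplified stand-in class, not Beam's actual OffsetRangeTracker: the class name RangeClaimTrackerSketch and the methods tryClaim, tryClaimRange and trySplitAt are hypothetical and chosen only for illustration. It also assumes that a split is refused when the proposed offset is at or before the last claimed position.

```java
/**
 * Hypothetical, simplified stand-in for an offset range tracker.
 * This is NOT Beam's OffsetRangeTracker API; all names are assumptions.
 */
final class RangeClaimTrackerSketch {
    private final long startOffset; // inclusive start of the tracked range
    private long stopOffset;        // exclusive end; shrinks after a successful split
    private long lastClaimed = -1;  // -1 means nothing has been claimed yet

    public RangeClaimTrackerSketch(long startOffset, long stopOffset) {
        if (startOffset < 0 || stopOffset < startOffset) {
            throw new IllegalArgumentException("invalid range");
        }
        this.startOffset = startOffset;
        this.stopOffset = stopOffset;
    }

    /** Claims the single position a. */
    public synchronized boolean tryClaim(long a) {
        return claim(a, a);
    }

    /**
     * Claims the range [a, b): the range check is done on a only,
     * but the last claimed position becomes b.
     */
    public synchronized boolean tryClaimRange(long a, long b) {
        if (b < a) {
            throw new IllegalArgumentException("b must not be less than a");
        }
        return claim(a, b);
    }

    private boolean claim(long a, long newLastClaimed) {
        if (a < startOffset || a < lastClaimed) {
            throw new IllegalStateException("claims must be in range and non-decreasing");
        }
        if (a >= stopOffset) {
            return false; // a is outside the tracked range, so nothing is claimed
        }
        lastClaimed = newLastClaimed;
        return true;
    }

    /** Tries to shrink the range to [startOffset, splitOffset). */
    public synchronized boolean trySplitAt(long splitOffset) {
        if (lastClaimed < 0) {
            return false; // not started yet
        }
        if (splitOffset <= lastClaimed || splitOffset >= stopOffset) {
            return false; // position is already protected, or lies outside the range
        }
        stopOffset = splitOffset;
        return true;
    }

    public static void main(String[] args) {
        RangeClaimTrackerSketch t = new RangeClaimTrackerSketch(0, 100);
        System.out.println(t.tryClaimRange(40, 70)); // true: 40 is within [0, 100)
        System.out.println(t.trySplitAt(55));        // false: 55 is inside the claimed block
        System.out.println(t.trySplitAt(80));        // true: range shrinks to [0, 80)
        System.out.println(t.tryClaim(80));          // false: 80 is now outside the range
    }
}
```

In this sketch, a reader positioned in the last block could claim the block's whole extent, so any split request landing inside that block is refused instead of producing a residual source with nothing to read.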

          People

            Assignee: Unassigned
            Reporter: Eugene Kirpichov (jkff)