Avro / AVRO-652

Expose sync points in DataFileReader

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      DataFileReader.blockStart is hidden from the caller, but exposing it (as read-only) would allow callers to discover sync points in a file by watching the value for changes.

      seek(long) only takes an exact block boundary, while sync(long) moves to the next block boundary. There does not appear to be a way to rediscover points that you can seek() to, which would be very useful for building or recovering an index for a file.

        Activity

        Stu Hood added a comment -

        Adds the 'DataFileReader.previousSync()' method to return the last encountered synchronization point. Also adds a test which exposed a bug in seek(long).
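
The index-recovery use case from the description can be sketched as below. This is a toy model, not Avro's implementation: `BlockReader`, `InMemoryBlockReader`, and `SyncIndex` are hypothetical names, and the in-memory reader only mimics the method names discussed in this issue (`hasNext()`, `next()`, `previousSync()`, `seek(long)`). It assumes that `previousSync()` reports the start of the block the next record will be read from, which should be checked against the real reader before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the reader methods discussed in this issue. */
interface BlockReader<T> {
    boolean hasNext();
    T next();
    long previousSync();       // assumed: start of the block holding the next record
    void seek(long position);  // accepts only an exact block boundary
}

/** Toy reader over in-memory blocks; starts[i] is the pretend offset of block i. */
final class InMemoryBlockReader implements BlockReader<String> {
    private final List<List<String>> blocks;
    private final long[] starts;
    private int block = 0;
    private int item = 0;
    private long blockStart;

    InMemoryBlockReader(List<List<String>> blocks, long[] starts) {
        if (blocks.isEmpty() || blocks.size() != starts.length) {
            throw new IllegalArgumentException("need one start offset per block");
        }
        this.blocks = blocks;
        this.starts = starts;
        this.blockStart = starts[0];
    }

    @Override public boolean hasNext() {
        // Step over exhausted blocks, updating the visible block start as we go.
        while (block < blocks.size() && item >= blocks.get(block).size()) {
            block++;
            item = 0;
            if (block < blocks.size()) {
                blockStart = starts[block];
            }
        }
        return block < blocks.size();
    }

    @Override public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return blocks.get(block).get(item++);
    }

    @Override public long previousSync() {
        return blockStart;
    }

    @Override public void seek(long position) {
        for (int i = 0; i < starts.length; i++) {
            if (starts[i] == position) {
                block = i;
                item = 0;
                blockStart = position;
                return;
            }
        }
        throw new IllegalArgumentException("not a block boundary: " + position);
    }
}

/** Rebuilds a list of seekable positions by watching previousSync() for changes. */
final class SyncIndex {
    static <T> List<Long> discover(BlockReader<T> reader) {
        List<Long> syncs = new ArrayList<>();
        while (reader.hasNext()) {
            long start = reader.previousSync();
            // Record a position only when it differs from the last one seen.
            if (syncs.isEmpty() || syncs.get(syncs.size() - 1) != start) {
                syncs.add(start);
            }
            reader.next();
        }
        return syncs;
    }
}
```

Every position returned by `SyncIndex.discover` can then be passed back to `seek(long)`. With a real `DataFileReader` the same loop shape should apply, calling `previousSync()` before each `next()`.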

        Stu Hood added a comment -

        Upgrading to a bug, due to the seek(long) issue.

        Doug Cutting added a comment -

        I just committed this. Thanks, Stu!


          People

          • Assignee:
            Stu Hood
            Reporter:
            Stu Hood
          • Votes:
            0
            Watchers:
            1
