Apache Avro / AVRO-1035

Add the possibility to append to existing avro files


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Currently it is not possible to append to Avro files that were written and closed.

      Here is Scott Carey's reply on the mailing list:

      It is not possible without modifying DataFileWriter. Please open a JIRA
      ticket.

      It could not simply append to an OutputStream, since it must either:

      • Seek to the start to validate the schemas match and find the sync
        marker, or
      • Trust that the schemas match and find the sync marker from the last block

      DataFileWriter cannot refer to Hadoop classes such as FileSystem, but we
      could add something to the mapred module that takes a Path and FileSystem
      and returns something that implements an interface that DataFileWriter
      can append to. This would be something that is both a
      http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/SeekableInput.html
      and an OutputStream, or has both an InputStream from the start of the
      existing file and an OutputStream at the end.
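
      To make the two requirements in the reply concrete, here is a minimal
      sketch of the "InputStream from the start, OutputStream at the end"
      variant. It is an illustration only: the class AppendSketch, its methods,
      and the toy file layout (a schema string, a 16-byte sync marker, then
      length-prefixed blocks) are assumptions made for this example and are
      not Avro's real container format or the DataFileWriter API.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy container, NOT Avro's on-disk format:
//   header = schema (writeUTF) + 16-byte sync marker
//   block  = int length + payload bytes + the same sync marker
final class AppendSketch {
    static final int SYNC_SIZE = 16;

    // Writes a fresh header with a random sync marker and closes the file.
    static void create(File file, String schema) {
        byte[] sync = new byte[SYNC_SIZE];
        new SecureRandom().nextBytes(sync);
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeUTF(schema);
            out.write(sync);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Appends one block to a file that was already written and closed.
    static void append(File file, String expectedSchema, byte[] payload) {
        byte[] sync = new byte[SYNC_SIZE];
        // Step 1: read from the start to validate the schema and recover the sync marker.
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            String schema = in.readUTF();
            if (!schema.equals(expectedSchema)) {
                throw new IllegalArgumentException("schema mismatch: " + schema);
            }
            in.readFully(sync);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Step 2: open an OutputStream positioned at the end and reuse the existing marker.
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file, true))) {
            out.writeInt(payload.length);
            out.write(payload);
            out.write(sync);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads every block back as a UTF-8 string, checking each trailing sync marker.
    static List<String> readAll(File file) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            in.readUTF(); // skip the schema
            byte[] sync = new byte[SYNC_SIZE];
            in.readFully(sync);
            byte[] marker = new byte[SYNC_SIZE];
            while (in.available() > 0) {
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                in.readFully(marker);
                if (!Arrays.equals(sync, marker)) {
                    throw new IllegalStateException("sync marker mismatch");
                }
                records.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

      In this sketch, calling create once and then append several times, each
      after the file has been closed, yields a file whose blocks all end in the
      marker from the original header. An append with a different schema
      string is rejected before any bytes are written.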

          People

            Assignee: Unassigned
            Reporter: Vyacheslav Zholudev (detonator413)
            Votes: 0
            Watchers: 3
