Apache Avro / AVRO-3013

Avro files should allow fsync-ing files to disk in Python

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: python

    Description

      I am new to Apache, but here I am...

      In our use case, we need to constantly update an existing Avro file. We do this by copying the old Avro file to a temporary file, appending data to the temporary file, closing it, and renaming it over the original Avro file. This is problematic because closing a file does not guarantee that its data has been written to disk, so a crash or power loss shortly after the rename can leave the file incomplete. The resulting bug is hard to track down because it is hard to reproduce.
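      For illustration, the copy, append, and rename workflow described above could be made durable roughly as follows. This is only a sketch using the Python standard library on plain bytes, not the Avro API: the function name `append_durably` and its parameters are hypothetical, and it assumes the appended payload is already valid for the file format.

```python
import os
import shutil
import tempfile


def append_durably(path, payload):
    """Hypothetical helper: append bytes to `path` via a temp copy, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, tmp)
            tmp.write(payload)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to write its cache to disk
        os.replace(tmp_path, path)  # rename over the original file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # On POSIX systems, fsync the directory so the rename itself is persisted.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

      The key step is calling `flush()` and then `os.fsync()` before the rename; `close()` alone only hands the data to the operating system. Note that `tempfile.mkstemp` creates the file with restrictive permissions, so the original file mode is not preserved in this sketch.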

      I noticed that AVRO-1388 (https://issues.apache.org/jira/browse/AVRO-1388) addresses this for the Java client. Why isn't it implemented for the Python client? If there are no objections, I'd like to submit a patch. Or perhaps I am missing something here? Please let me know!

          People

            Assignee: Unassigned
            Reporter: He Chen (miluChen)