Apache Avro / AVRO-3013

Avro files should allow fsync-ing files to disk in Python


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: python
    • Labels: None

    Description

      I am new to Apache, but here I am...

      In our use case, we need to constantly update an existing Avro file. We do this by copying the old Avro file to a temporary file, appending data to the temporary file, closing it, and renaming it over the original Avro file. This is problematic because closing a file does not guarantee that its data has been written to disk. The resulting bug is hard to track down because it is hard to reproduce.
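      To make the copy, append, close, rename sequence concrete, here is a minimal sketch of how it could be made durable using only the Python standard library. The function name `durable_update` and the `append_fn` callback are hypothetical and are not part of the Avro API. The sketch assumes a POSIX-like system for the directory sync step.

```python
import os
import shutil
import tempfile


def durable_update(path, append_fn):
    """Copy `path` to a temp file, append to it, fsync it, then rename it over `path`.

    `append_fn` is a hypothetical callback that receives the open temp file
    (binary append mode) and writes the new data to it.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(path, tmp_path)
        with open(tmp_path, "ab+") as f:
            append_fn(f)
            f.flush()             # push Python's userspace buffer to the OS
            os.fsync(f.fileno())  # ask the OS to write the data to the storage device
        os.replace(tmp_path, path)  # rename over the original file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # Sync the containing directory so the rename itself is persisted (POSIX only).
    if os.name == "posix":
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

      The key point is that `flush()` and `os.fsync()` run before the file is closed and renamed. With the Avro Python library, `append_fn` would presumably wrap the open file in a writer and flush it before returning. That wiring is left out here because it is not shown in this report.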

      I noticed that there is a ticket that addresses this for the Java client (https://issues.apache.org/jira/browse/AVRO-1388). Why isn't it implemented for the Python client? If there are no objections, I'd like to submit a patch. Or perhaps I am missing something here? Please let me know!


          People

            Assignee: Unassigned
            Reporter: He Chen (miluChen)
