Apache Arrow / ARROW-16761

[C++][Python] Track bytes_written on FileWriter / WrittenFile

Details

    Description

      For Apache Iceberg and Delta Lake tables, we need to be able to get the size, in bytes, of each file written. In Iceberg, this is the required field file_size_in_bytes. In Delta, this is the required field size, which is part of the Add action.

      I think this could be exposed on FileWriter and, through it, on WrittenFile. I'm not yet sure how to obtain the count at a lower level than that. FileWriter owns its OutputStream; would OutputStream::Tell() give the correct count?
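
      To make the Tell() question concrete, here is a minimal sketch in plain Python using only the standard library; it does not use Arrow's API. CountingWriter and write_chunks are hypothetical names introduced for illustration. The sketch assumes the stream starts at position 0, is only ever appended to, and has no unflushed buffering; under those assumptions the stream position equals the number of bytes written, which is the property the issue asks about for OutputStream::Tell().

```python
import io


class CountingWriter:
    """Hypothetical wrapper that counts bytes passed to an underlying binary stream."""

    def __init__(self, sink):
        self._sink = sink
        self.bytes_written = 0

    def write(self, data):
        # io streams return the number of bytes actually written.
        n = self._sink.write(data)
        self.bytes_written += n
        return n

    def tell(self):
        # Position of the underlying stream; equals bytes_written only if the
        # stream started at 0 and was never seeked (an assumption of this sketch).
        return self._sink.tell()


def write_chunks(chunks):
    """Write chunks to an in-memory sink and report the three size measures."""
    sink = io.BytesIO()
    writer = CountingWriter(sink)
    for chunk in chunks:
        writer.write(chunk)
    return writer.bytes_written, writer.tell(), len(sink.getvalue())
```

      If the same idea carries over to Arrow, reading the position after the writer has finished (footer included) would give the value needed for file_size_in_bytes or size; whether that holds for every OutputStream implementation is exactly the open question above.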

            People

              wjones127 Will Jones

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 20m