  1. Apache Arrow
  2. ARROW-9952

[Python] Use pyarrow.dataset writing for pq.write_to_dataset


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: Python

    Description

      Now that ARROW-9658 and ARROW-9893 are in, we can explore using the pyarrow.dataset writing capabilities in parquet.write_to_dataset.

      As was done for pq.read_table, we could initially add a keyword to switch between the two implementations, eventually default to the new datasets-based one, and deprecate the old (inefficient) Python implementation.
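
      The keyword-switch approach described above could look roughly like the following sketch. It is not pyarrow's actual code: the two private helpers are hypothetical stand-ins for the old Python implementation and a pyarrow.dataset-based one, and the keyword name use_legacy_dataset is assumed here by analogy with pq.read_table.

```python
import warnings

# Assumption: while the new path is being evaluated, the old one stays the default.
# Flipping this to False models "eventually defaulting to the new datasets one".
_DEFAULT_USE_LEGACY = True


def _write_legacy(rows, root_path):
    # Hypothetical stand-in for the old pure-Python implementation.
    return ("legacy", root_path, len(rows))


def _write_with_datasets(rows, root_path):
    # Hypothetical stand-in for an implementation built on pyarrow.dataset.
    return ("datasets", root_path, len(rows))


def write_to_dataset(rows, root_path, use_legacy_dataset=None):
    """Dispatch to one of two implementations based on a keyword."""
    if use_legacy_dataset is None:
        use_legacy_dataset = _DEFAULT_USE_LEGACY
    if use_legacy_dataset:
        if not _DEFAULT_USE_LEGACY:
            # Once the new path is the default, explicitly asking for the
            # old one is what triggers the deprecation warning.
            warnings.warn(
                "the legacy implementation is deprecated",
                FutureWarning,
                stacklevel=2,
            )
        return _write_legacy(rows, root_path)
    return _write_with_datasets(rows, root_path)
```

      Callers who pass nothing get the current default, while use_legacy_dataset=False opts in to the new code path early. This lets the default change later without altering the function signature.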

            People

              Assignee: Joris Van den Bossche (jorisvandenbossche)
              Reporter: Joris Van den Bossche (jorisvandenbossche)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h