ARROW-5436: [Python] Expose filters argument in parquet.read_table

    Details

      Description

      Currently, the parquet.read_table function can be used both for reading a single file (as an interface to ParquetFile) and for reading a directory (as an interface to ParquetDataset).

      ParquetDataset has some extra keywords such as filters that would be nice to expose through read_table as well.

      Of course, one can always use ParquetDataset directly when its extra features are needed. But for pandas, which wraps pyarrow, it is easier to pass keywords straight through to parquet.read_table than to choose between calling read_table and ParquetDataset. Context: https://github.com/pandas-dev/pandas/issues/26551

              People

              • Assignee: Joris Van den Bossche (jorisvandenbossche)
              • Reporter: Joris Van den Bossche (jorisvandenbossche)
              • Votes: 0
              • Watchers: 2

              Time Tracking

              • Original Estimate: Not Specified
              • Remaining Estimate: 0h
              • Time Spent: 1h 20m