Apache Arrow / ARROW-13436

[Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns


Details

    Description

      The documentation for pyarrow.parquet.read_table states:

       

      • columns (list) – If not None, only these columns will be read from the file. A column name may be a prefix of a nested field, e.g. ‘a’ will select ‘a.b’, ‘a.c’, and ‘a.d.e’.

       

      It is not clear what the expected result is when columns is an empty list. In pyarrow 3.0 this read all columns (as long as use_legacy_dataset=False). In pyarrow 4.0 it reads no columns. I think reading no columns is the correct behavior, since None can already be used to request all columns, but we should state that explicitly in the docs.
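To make the proposed semantics concrete, here is a minimal sketch in plain Python. It is a toy model only, not pyarrow's implementation: the helper `select_columns` and the sample field paths are hypothetical names introduced for illustration. It assumes the pyarrow 4.0 behavior described above (an empty list selects nothing, None selects everything) together with the prefix rule quoted from the documentation.

```python
from typing import List, Optional


def select_columns(available: List[str], columns: Optional[List[str]]) -> List[str]:
    """Toy model of the `columns` semantics discussed in this issue.

    `available` holds dotted leaf field paths such as 'a.b' or 'a.d.e'.
    This helper is hypothetical and is not part of pyarrow.
    """
    if columns is None:
        # None means "no filter": every column is read.
        return list(available)
    selected = []
    for path in available:
        for name in columns:
            # A requested name matches itself or any nested field below it.
            if path == name or path.startswith(name + "."):
                selected.append(path)
                break
    # An empty `columns` list never matches, so nothing is selected.
    return selected


fields = ["a.b", "a.c", "a.d.e", "x"]  # hypothetical schema
all_columns = select_columns(fields, None)   # every field
no_columns = select_columns(fields, [])      # no fields at all
nested = select_columns(fields, ["a"])       # only the fields under 'a'
```

Under this model, an empty list and None are deliberately distinct: treating `[]` as "all columns" would leave no way to request zero columns, whereas None already covers the "read everything" case.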


            People

              sakras Sasha Krassovsky
              westonpace Weston Pace


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h