Apache Arrow / ARROW-1731

[Python] Provide for selecting a subset of columns to convert in RecordBatch/Table.from_pandas

Details

    Description

      Currently conversion is all-or-nothing: from_pandas converts every column of the DataFrame, and doing the subsetting in pandas beforehand incurs a data copy. Selecting columns (by name or index) at conversion time would avoid that additional copy. We should add a columns= argument to the from_pandas calls and do the subsetting when we dispatch the individual arrays for conversion to Arrow.

      cc cpcloud jreback
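
      A minimal sketch of the subsetting-at-dispatch idea described above, in plain Python. It is not the pyarrow implementation: `resolve_columns`, `convert_selected`, and the `convert` callback are hypothetical names, and a dict of lists stands in for the DataFrame. The point it illustrates is that only the selected columns are handed to the per-column converter, and they are handed over as-is rather than through an intermediate subset of the frame.

```python
from typing import Any, Callable, Dict, List, Mapping, Optional, Sequence, Union

Selector = Union[str, int]  # a column name or a positional index


def resolve_columns(all_names: Sequence[str],
                    columns: Optional[Sequence[Selector]]) -> List[str]:
    """Map name-or-index selectors to column names (hypothetical helper).

    None keeps the current behaviour: every column is selected.
    """
    if columns is None:
        return list(all_names)
    resolved = []
    for sel in columns:
        if isinstance(sel, int):
            resolved.append(all_names[sel])  # positional selector
        elif sel in all_names:
            resolved.append(sel)             # name selector
        else:
            raise KeyError(f"unknown column: {sel!r}")
    return resolved


def convert_selected(frame: Mapping[str, Any],
                     columns: Optional[Sequence[Selector]],
                     convert: Callable[[Any], Any]) -> Dict[str, Any]:
    """Dispatch only the selected columns to the per-column converter.

    `convert` is a stand-in for the per-array Arrow conversion; each column
    object is passed through unchanged, so no subset of `frame` is built.
    """
    names = resolve_columns(list(frame), columns)
    return {name: convert(frame[name]) for name in names}


# Usage: select one column by name and one by index.
frame = {"a": [1, 2], "b": [3.0, 4.0], "c": ["x", "y"]}
seen = []  # records exactly which column objects reached the converter


def fake_convert(values):
    seen.append(values)
    return tuple(values)


result = convert_selected(frame, ["a", 2], fake_convert)
# Proposed call shape from this issue, for comparison:
#   pa.Table.from_pandas(df, columns=[...])
```

      Resolving the selectors before dispatch keeps the conversion loop itself unchanged; column "b" never reaches the converter.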

            People

              apitrou Antoine Pitrou
              wesm Wes McKinney
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h