Apache Arrow / ARROW-564

[Python] Add methods to return vanilla NumPy arrays (plus boolean mask array if there are nulls)

    Details

      Description

      At the moment, pyarrow.Array instances have a method called to_pandas. Although this method returns NumPy arrays, it returns them in the form that pandas would use in a Series. The difference is visible, for example, for integers with null values. For pandas, we convert them into a float array and set every entry that is null in the Arrow array to NaN. For vanilla NumPy arrays, we would instead return a tuple of a validity bytemap (not bitmap!) and a values array. The values array in this case should simply be a view on the underlying Arrow buffer.
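      A rough sketch of the proposed return shape, using only the Python standard library so that it runs without pyarrow installed. The helper names (bitmap_to_bytemap, to_values_and_mask) are hypothetical and not part of any pyarrow API, the column is assumed to be int64, and a memoryview stands in for the NumPy view (np.frombuffer would play that role in practice). The bitmap is read least-significant-bit first, as in the Arrow format.

```python
import struct


def bitmap_to_bytemap(bitmap: bytes, length: int) -> bytes:
    """Expand a packed validity bitmap into one byte per element (1 = valid)."""
    # Arrow packs validity bits least-significant-bit first within each byte.
    return bytes((bitmap[i // 8] >> (i % 8)) & 1 for i in range(length))


def to_values_and_mask(values_buffer, bitmap: bytes, length: int):
    """Hypothetical helper: return (bytemap, values) for an int64 column."""
    # memoryview.cast reinterprets the bytes in place, so no values are copied.
    values = memoryview(values_buffer).cast("q")[:length]
    return bitmap_to_bytemap(bitmap, length), values


# Example column [1, null, 3, 4]; the null slot holds an arbitrary value (0 here).
values_buffer = bytearray(struct.pack("=4q", 1, 0, 3, 4))
validity_bitmap = bytes([0b00001101])  # bits 0, 2 and 3 set, bit 1 cleared

bytemap, values = to_values_and_mask(values_buffer, validity_bitmap, 4)
print(list(bytemap))  # [1, 0, 1, 1]
print(list(values))   # [1, 0, 3, 4]
```

      Unlike the to_pandas path, the integers keep their dtype and the null information travels separately in the bytemap, so the caller decides how to treat the null slots.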

              People

              • Assignee:
                frg Florian Rathgeber
                Reporter:
                wesm Wes McKinney
              • Votes:
                0
                Watchers:
                4

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  3h 20m