Apache Arrow / ARROW-6327

[Python] Conversion of pandas.SparseArray columns in pandas.DataFrames to pyarrow.Table and back

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Do
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python
    • Labels: None

    Description

      We would like to convert sparse columns of a pandas DataFrame to an Arrow table and back:

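      import numpy as np
      import pandas
      import pyarrow

      # One dense column and one sparse column (fill value 0 for int64).
      arr = pandas.Series([1, 2, 3])
      # pandas.SparseArray is available as pandas.arrays.SparseArray in current pandas.
      sparr = pandas.arrays.SparseArray(np.array([1, 0, 0], dtype='int64'))
      df = pandas.DataFrame({'sparr': sparr, 'arr': arr})

      # Desired round trip: pandas -> Arrow -> pandas should give back the same data.
      table = pyarrow.table(df)
      df == table.to_pandas()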
      

      I assume `pandas.SparseArray` is a 1D sparse COO Tensor that would map to `pyarrow.SparseTensorCOO`.
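
      To make the assumption above concrete, here is a minimal sketch of how a sparse array can be split into COO-style pieces (coordinates, stored values, shape, fill value) and rebuilt from them. It uses only pandas and NumPy and does not call any pyarrow sparse tensor API; the helper names `sparse_array_to_coo` and `coo_to_sparse_array` are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd


def sparse_array_to_coo(sparr):
    """Hypothetical helper: split a SparseArray into COO-style pieces."""
    # sp_values holds only the stored (non-fill) values.
    values = np.asarray(sparr.sp_values)
    # sp_index may be block- or integer-based; to_int_index() normalizes it.
    positions = np.asarray(sparr.sp_index.to_int_index().indices, dtype="int64")
    # COO coordinates have one column per dimension; a 1D array has one column.
    coords = positions.reshape(-1, 1)
    return coords, values, (len(sparr),), sparr.fill_value


def coo_to_sparse_array(coords, values, shape, fill_value):
    """Hypothetical helper: rebuild a SparseArray from COO-style pieces."""
    dense = np.full(shape, fill_value, dtype=values.dtype)
    dense[coords[:, 0]] = values
    return pd.arrays.SparseArray(dense, fill_value=fill_value)


sparr = pd.arrays.SparseArray(np.array([1, 0, 0], dtype="int64"))
coords, values, shape, fill_value = sparse_array_to_coo(sparr)
roundtrip = coo_to_sparse_array(coords, values, shape, fill_value)
```

      Note that a COO tensor as sketched here has no notion of a fill value other than zero, so the fill value would have to be carried separately (for example in metadata) for a lossless round trip. That is an assumption of this sketch, not a statement about how Arrow handles it.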

          People

            Assignee: Unassigned
            Reporter: Rok Mihevc (rokm)
            Votes: 1
            Watchers: 3
