Apache Arrow / ARROW-6872

[C++][Python] Empty table with dictionary-columns raises ArrowNotImplementedError


Details

    Description

      Abstract

      As a pyarrow user, I expect to be able to create an empty table from any schema that was derived from a pandas DataFrame. This fails for dictionary types (e.g. pandas "category" dtypes).

      Test Case

      This code:

      import pandas as pd
      import pyarrow as pa
      
      df = pd.DataFrame({"x": pd.Series(["x", "y"], dtype="category")})
      table = pa.Table.from_pandas(df)
      schema = table.schema
      table_empty = schema.empty_table()  # raises ArrowNotImplementedError
      

      produces this exception:

      Traceback (most recent call last):
        File "arrow_bug.py", line 8, in <module>
          table_empty = schema.empty_table()
        File "pyarrow/types.pxi", line 860, in __iter__
        File "pyarrow/array.pxi", line 211, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 36, in pyarrow.lib._sequence_to_array
        File "pyarrow/error.pxi", line 86, in pyarrow.lib.check_status
      pyarrow.lib.ArrowNotImplementedError: Sequence converter for type dictionary<values=string, indices=int8, ordered=0> not implemented
      

            People

              jorisvandenbossche Joris Van den Bossche
              marco.neumann.by Marco Neumann

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h