Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
0.5.0
None
Description
df.to_parquet correctly raises here (because categorical is not implemented), but it still creates an empty file at the target path.
xref https://github.com/pandas-dev/pandas/pull/15838#pullrequestreview-52576290
In [2]: df = pd.DataFrame({'a': list('abc'),
   ...:                    'b': list(range(1, 4)),
   ...:                    'c': np.arange(3, 6).astype('u1'),
   ...:                    'd': np.arange(4.0, 7.0, dtype='float64'),
   ...:                    'e': [True, False, True],
   ...:                    'f': pd.Categorical(list('abc')),
   ...:                    'g': pd.date_range('20130101', periods=3),
   ...:                    'h': pd.date_range('20130101', periods=3, tz='US/Eastern'),
   ...:                    'i': pd.date_range('20130101', periods=3, freq='ns')})
   ...:

In [3]: df.to_parquet('foo.pq')
---------------------------------------------------------------------------
ArrowNotImplementedError                  Traceback (most recent call last)
<ipython-input-3-8070fb7e3e2c> in <module>()
----> 1 df.to_parquet('foo.pq')

/Users/jreback/pandas/pandas/core/frame.py in to_parquet(self, fname, engine, compression, **kwargs)
   1620         from pandas.io.parquet import to_parquet
   1621         to_parquet(self, fname, engine,
-> 1622                    compression=compression, **kwargs)
   1623
   1624     @Substitution(header='Write out column names. If a list of string is given, \

/Users/jreback/pandas/pandas/io/parquet.py in to_parquet(df, path, engine, compression, **kwargs)
    152         raise ValueError("parquet must have string column names")
    153
--> 154     return impl.write(df, path, compression=compression)
    155
    156

/Users/jreback/pandas/pandas/io/parquet.py in write(self, df, path, compression, **kwargs)
     53             table = self.api.Table.from_pandas(df, timestamps_to_ms=True)
     54             self.api.parquet.write_table(
---> 55                 table, path, compression=compression, **kwargs)
     56
     57     def read(self, path):

/Users/jreback/miniconda3/envs/pandas/lib/python3.6/site-packages/pyarrow/parquet.py in write_table(table, where, row_group_size, version, use_dictionary, compression, use_deprecated_int96_timestamps, **kwargs)
    770             version=version,
    771             use_deprecated_int96_timestamps=use_deprecated_int96_timestamps)
--> 772         writer = ParquetWriter(where, table.schema, **options)
    773         writer.write_table(table, row_group_size=row_group_size)
    774         writer.close()

_parquet.pyx in pyarrow._parquet.ParquetWriter.__cinit__()

error.pxi in pyarrow.lib.check_status()

ArrowNotImplementedError: NotImplemented: unhandled type

In [4]: !ls -ltr foo.pq
-rw-r--r-- 1 jreback staff 0 Jul 27 06:03 foo.pq
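
Until the writer cleans up after itself, a caller could guard against the leftover file along the following lines. This is only a sketch: `write_or_cleanup` and `failing_writer` are hypothetical names introduced for illustration and are not part of pandas or pyarrow. `failing_writer` stands in for a writer that creates the target file and then raises, as in the session above.

```python
import os
import tempfile


def write_or_cleanup(path, write_fn):
    """Call write_fn(path); if it raises, remove any file it left at path.

    Hypothetical helper for illustration, not part of pandas or pyarrow.
    """
    existed_before = os.path.exists(path)
    try:
        return write_fn(path)
    except Exception:
        # Only delete a file that this call created, never a pre-existing one.
        if not existed_before and os.path.exists(path):
            os.remove(path)
        raise


def failing_writer(path):
    # Stand-in for a writer that creates the target file and then fails.
    open(path, "wb").close()
    raise NotImplementedError("unhandled type")


target = os.path.join(tempfile.mkdtemp(), "foo.pq")
try:
    write_or_cleanup(target, failing_writer)
except NotImplementedError:
    pass  # the original error still propagates to the caller

leftover = os.path.exists(target)  # False: the empty file was removed
```

Assuming `df.to_parquet` accepts the path as its first argument, as in the session above, the same helper could be used as `write_or_cleanup('foo.pq', df.to_parquet)`.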