Details
Bug
Status: Closed
Major
Resolution: Fixed
0.8.0
None
Mac OS High Sierra
Python 3.6.3
Description
Minimal example to recreate:
    import pandas as pd
    import pyarrow as pa

    df = pd.DataFrame({'a': []})
    df['a'] = df['a'].astype(str)
    schema = pa.schema([pa.field('a', pa.string())])
    pa.Table.from_pandas(df, schema=schema)
This causes the Python interpreter to exit with "Segmentation fault: 11".
The following examples all work without any issue:
    # column 'a' is no longer empty
    df = pd.DataFrame({'a': ['foo']})
    df['a'] = df['a'].astype(str)
    schema = pa.schema([pa.field('a', pa.string())])
    pa.Table.from_pandas(df, schema=schema)

    # column 'a' is empty, but no schema is specified
    df = pd.DataFrame({'a': []})
    df['a'] = df['a'].astype(str)
    pa.Table.from_pandas(df)

    # column 'a' is empty, but no type 'str' specified in Pandas
    df = pd.DataFrame({'a': []})
    schema = pa.schema([pa.field('a', pa.string())])
    pa.Table.from_pandas(df, schema=schema)
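Because the failing case kills the interpreter, it can be awkward to check from a test script or an interactive session. The following is a minimal sketch, not part of the original report, that runs the minimal example in a child interpreter and reports how that child exited. The names `REPRO`, `run_isolated`, and `describe` are hypothetical helpers introduced here for illustration, and the interpretation of negative return codes assumes a POSIX system.

```python
import subprocess
import sys
import textwrap

# The minimal example from this report, kept as source text so it can be
# executed in a separate interpreter process.
REPRO = textwrap.dedent("""
    import pandas as pd
    import pyarrow as pa

    df = pd.DataFrame({'a': []})
    df['a'] = df['a'].astype(str)
    schema = pa.schema([pa.field('a', pa.string())])
    pa.Table.from_pandas(df, schema=schema)
""")


def run_isolated(snippet, timeout=60):
    """Run `snippet` in a child interpreter and return its exit code.

    On POSIX systems a negative value -N means the child was killed by
    signal N (a segmentation fault is SIGSEGV).
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit code from run_isolated into a short description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

Calling `print(describe(run_isolated(REPRO)))` keeps the calling process alive whatever happens in the child. On an affected installation the description would be expected to name a signal rather than report "ok", while the three working variants above should report "ok" when wrapped the same way, with the imports added.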