Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- cpp-1.2.0
- None
Description
Reading the attached Parquet file into a pandas DataFrame and then using the DataFrame causes a segfault.
Python 3.5.3 |Continuum Analytics, Inc.| (default, Mar 6 2017, 11:58:13)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import pyarrow
>>> import pyarrow.parquet as pq
>>> pyarrow.__version__
'0.6.0'
>>> import pandas as pd
>>> pd.__version__
'0.19.0'
>>> df = pq.read_table('part-00000-6570e34b-b42c-4a39-8adf-21d3a97fb87d.snappy.parquet') \
...     .to_pandas()
>>> len(df)
69
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 69 entries, 0 to 68
Data columns (total 6 columns):
label               69 non-null int32
account_meta        69 non-null object
features_type       69 non-null int32
features_size       69 non-null int32
features_indices    1 non-null object
features_values     1 non-null object
dtypes: int32(3), object(3)
memory usage: 2.5+ KB
>>>
>>> pd.concat([df, df])
Segmentation fault (core dumped)
Actually, just print(df) is enough to trigger the segfault.
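For anyone trying to reproduce this without losing their interactive session, here is a minimal sketch that runs the failing steps in a child process with Python's faulthandler enabled. The helper name run_isolated is hypothetical (not part of pyarrow or pandas), and the script assumes the attached file sits in the current directory. On POSIX a negative return code means the child was killed by a signal, so a segfault shows up as -SIGSEGV together with a Python-level traceback on stderr.

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child process (hypothetical helper).

    Returns (crashed, returncode, stderr). faulthandler is enabled in the
    child so a fatal signal prints the Python traceback to stderr.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # On POSIX, a negative return code means "terminated by that signal".
    crashed = proc.returncode < 0
    return crashed, proc.returncode, proc.stderr


# Assumption: the attachment from this issue is in the working directory.
REPRO = """
import pyarrow.parquet as pq
df = pq.read_table(
    'part-00000-6570e34b-b42c-4a39-8adf-21d3a97fb87d.snappy.parquet'
).to_pandas()
print(df)
"""

if __name__ == "__main__":
    crashed, code, err = run_isolated(REPRO)
    if crashed and code == -signal.SIGSEGV:
        print("child segfaulted; faulthandler output follows")
    else:
        print("child exited with return code", code)
    print(err)
```

Running the reproduction in a subprocess keeps the parent interpreter alive, which makes it easier to compare pyarrow versions or to bisect which column triggers the crash.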
Attachments