Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
0.9.0
Description
Original issue title: "Struct type inference and conversion works for lists but not NumPy arrays with dtype object"
Example, setup:
import pandas as pd
import pyarrow as pa

s = pd.Series([{
    'data': {'document_id': None,
             'document_type': None,
             'master_customer_id': None,
             'message': 'User Login Request',
             'policy_id': None,
             'sequence_no': 14,
             'user_name': None},
    'header': {'actor_id': None,
               'actor_type': None,
               'brand_code': 'ES',
               'event_origin': None,
               'event_timestamp': '2018-01-01T18:25:43.511Z',
               'event_type': 'LOGIN',
               'master_customer_id': '14',
               'source': 'CUSTOMER_AUTH_SERVICE',
               'source_id': None,
               'source_version': None},
    'payload_version': '1',
    'status': {'status_code': 100, 'status_message': 'Success'}
}])
This works:
In [24]: pa.array(list(s))
Out[24]:
<pyarrow.lib.StructArray object at 0x7f8435b09c28>
[
  {'data': {'document_id': None, 'document_type': None, 'master_customer_id': None, 'message': 'User Login Request', 'policy_id': None, 'sequence_no': 14, 'user_name': None}, 'header': {'actor_id': None, 'actor_type': None, 'brand_code': 'ES', 'event_origin': None, 'event_timestamp': '2018-01-01T18:25:43.511Z', 'event_type': 'LOGIN', 'master_customer_id': '14', 'source': 'CUSTOMER_AUTH_SERVICE', 'source_id': None, 'source_version': None}, 'payload_version': '1', 'status': {'status_code': 100, 'status_message': 'Success'}}
]
This does not:
In [23]: pa.array(s)
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-23-eba23a1638b7> in <module>()
----> 1 pa.array(s)

~/code/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
    175         values, type = pdcompat.get_datetimetz_type(values, obj.dtype,
    176                                                     type)
--> 177         return _ndarray_to_array(values, mask, type, from_pandas, pool)
    178     else:
    179         if mask is not None:

~/code/arrow/python/pyarrow/array.pxi in pyarrow.lib._ndarray_to_array()
     75
     76     with nogil:
---> 77         check_status(NdarrayToArrow(pool, values, mask,
     78                                     use_pandas_null_sentinels,
     79                                     c_type, &chunked_out))

~/code/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
     79         message = frombytes(status.message())
     80         if status.IsInvalid():
---> 81             raise ArrowInvalid(message)
     82         elif status.IsIOError():
     83             raise ArrowIOError(message)

ArrowInvalid: ../src/arrow/python/numpy_to_arrow.cc:1742 code: converter.Convert()
Error inferring Arrow type for Python object array. Got Python object of type dict but can only handle these types: string, bool, float, int, date, time, decimal, bytearray, list, array
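The error message lists the Python types the object-array inference path accepts, and dict is not among them. To illustrate what a dict-to-struct case involves, here is a minimal pure-Python sketch of recursive type inference. It is not Arrow's C++ implementation: `infer_type`, `unify`, `infer_column_type`, the type-name strings, and the second sample row are all hypothetical, chosen only for illustration.

```python
# Illustrative only: a pure-Python sketch of recursive type inference that
# includes a dict -> struct case. This is NOT Arrow's C++ implementation;
# the function names and type-name strings below are hypothetical.

def unify(a, b):
    """Merge two inferred types, treating "null" as compatible with anything."""
    if a == b or b == "null":
        return a
    if a == "null":
        return b
    if isinstance(a, dict) and isinstance(b, dict):
        # Struct types: unify field by field, keeping the union of field names.
        merged = dict(a)
        for name, typ in b.items():
            merged[name] = unify(merged.get(name, "null"), typ)
        return merged
    raise TypeError(f"cannot unify {a!r} with {b!r}")


def infer_type(value):
    """Infer a type description for a single Python value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        # The case missing from the error message above: a dict becomes a
        # struct whose field types are inferred recursively.
        return {key: infer_type(item) for key, item in value.items()}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def infer_column_type(values):
    """Infer one type for any iterable of values, regardless of container."""
    result = "null"
    for value in values:
        result = unify(result, infer_type(value))
    return result


# Sample rows: field names follow the issue; the second row is made up.
rows = [
    {"payload_version": "1",
     "status": {"status_code": 100, "status_message": "Success"}},
    {"payload_version": None,
     "status": {"status_code": 200, "status_message": None}},
]
column_type = infer_column_type(rows)
```

Because `infer_column_type` only iterates over its input, it gives the same answer for a list, a tuple, or any other iterable of the same values, which is the behaviour the issue title asks for between lists and object arrays.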