Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- None
- None
Description
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.cc#L631
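In Python 3, each datetime.date object takes 32 bytes, in addition to the PyObject* pointer that refers to it. So when a column contains many repeated dates, deduplicating the date objects (creating one object per distinct value and reusing it) will save a lot of memory in large DataFrame objects.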
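The idea can be illustrated in pure Python. This is only a minimal sketch of the memoization technique, not the C++ code in arrow_to_pandas.cc; the helper names `days_to_dates` and `distinct_object_bytes` are hypothetical, and the input is assumed to be a list of days since 1970-01-01, as in an Arrow date32 column.

```python
import datetime
import sys

EPOCH = datetime.date(1970, 1, 1)


def days_to_dates(days, deduplicate=True):
    """Convert days-since-epoch integers to datetime.date objects.

    Hypothetical helper. With deduplicate=True, equal day values
    share a single date object instead of each getting its own.
    """
    cache = {}  # day value -> already-created date object
    out = []
    for d in days:
        if d is None:  # nulls are passed through unchanged
            out.append(None)
            continue
        if not deduplicate:
            out.append(EPOCH + datetime.timedelta(days=d))
            continue
        obj = cache.get(d)
        if obj is None:
            obj = EPOCH + datetime.timedelta(days=d)
            cache[d] = obj
        out.append(obj)
    return out


def distinct_object_bytes(objs):
    """Sum sys.getsizeof over the distinct (by identity) non-None objects."""
    seen = {id(o): o for o in objs if o is not None}
    return sum(sys.getsizeof(o) for o in seen.values())


# Two distinct dates repeated many times.
days = [17897, 17898] * 50_000
plain = days_to_dates(days, deduplicate=False)
deduped = days_to_dates(days, deduplicate=True)

print("distinct objects without dedup:", len({id(o) for o in plain}))
print("distinct objects with dedup:   ", len({id(o) for o in deduped}))
print("bytes in date objects without dedup:", distinct_object_bytes(plain))
print("bytes in date objects with dedup:   ", distinct_object_bytes(deduped))
```

Both lists compare equal element by element; only the number of underlying date objects differs. The per-slot pointer cost of the list (or object array) is the same in both cases, so the saving comes entirely from the date objects themselves.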
Issue Links
- is related to
  - ARROW-3928 [Python] Add option to deduplicate PyBytes / PyString / PyUnicode objects in Table.to_pandas conversion path (Resolved)
- relates to
  - ARROW-3899 [Python] Table.to_pandas converts Arrow date32[day] to pandas datetime64[ns] (Resolved)