Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 0.8.0
- Ubuntu 16.04
Description
Converting a Table to pandas produces wrong values for a datetimetz row index with a non-UTC time zone. The index values are stored as datetime64[ns] and are interpreted directly as datetime64[ns, tz], rather than being interpreted as datetime64[ns, UTC] and then converted to datetime64[ns, tz]. Time zones are handled correctly for columns in Column.to_pandas, but not for the row index in table_to_blockmanager.
This is a minimal example demonstrating the failure of a roundtrip between a DataFrame and a Table:
    import pandas as pd
    import pyarrow as pa

    df = pd.DataFrame({
        'a': pd.date_range(
            start='2017-01-01', periods=3, tz='America/New_York'
        )
    })
    df = df.set_index('a')
    df_pa = pa.Table.from_pandas(df).to_pandas()

    print(df)
    print(df_pa)
The output is:
    Empty DataFrame
    Columns: []
    Index: [2017-01-01 00:00:00-05:00, 2017-01-02 00:00:00-05:00, 2017-01-03 00:00:00-05:00]
    Empty DataFrame
    Columns: []
    Index: [2017-01-01 05:00:00-05:00, 2017-01-02 05:00:00-05:00, 2017-01-03 05:00:00-05:00]
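The five-hour shift in the second index is consistent with the stored values being localized to the target zone instead of being treated as UTC first. The following is a minimal pandas-only sketch of the two interpretations; it does not reproduce the pyarrow code path, and the variable names (`stored`, `wrong`, `right`) are illustrative assumptions rather than anything from table_to_blockmanager.

```python
import pandas as pd

tz = "America/New_York"
original = pd.date_range(start="2017-01-01", periods=3, tz=tz)

# Assumed storage model: timezone-naive datetime64[ns] values holding UTC time.
stored = original.tz_convert("UTC").tz_localize(None)

# Interpretation described in the report as the bug:
# treat the stored values as if they were already local time in tz.
wrong = stored.tz_localize(tz)

# Interpretation described as correct:
# treat the stored values as UTC, then convert to tz.
right = stored.tz_localize("UTC").tz_convert(tz)

print(wrong)  # values shifted by the UTC offset, as in the second index above
print(right)  # matches the original index
```

Under this model, `wrong` matches the shifted index printed for `df_pa`, while `right` recovers the original index.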