Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version: 0.8.0
- Environment: Python 3.6.4. Mac OS X and CentOS Linux release 7.3.1611. Pandas 0.21.1.
Description
The following code:

    import pyarrow as pa
    import pyarrow.parquet as pq
    import pandas as pd

    n = 3
    df = pd.DataFrame({'x': range(n)},
                      index=pd.DatetimeIndex(start='2017-01-01', freq='1n', periods=n))
    pq.write_table(pa.Table.from_pandas(df), '/tmp/t.parquet')

results in:
ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data: 1483228800000000001
The desired behavior is to be able to write nanosecond-resolution timestamps without losing precision (for example, without a lossy conversion to a coarser unit such as ms). Note that if freq='1u' is used, the code runs properly.
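To illustrate why the freq='1n' index fails while freq='1u' succeeds, here is a minimal sketch of the arithmetic behind the error, using plain Python integers. The names cast_ns_to_us and NS_PER_US are hypothetical helpers made up for this example; they are not pyarrow API, and this is not pyarrow's actual implementation, only an imitation of the check it reports.

```python
NS_PER_US = 1_000  # nanoseconds per microsecond


def cast_ns_to_us(value_ns: int) -> int:
    """Convert a nanosecond timestamp to microseconds, refusing lossy casts.

    Hypothetical helper: mimics the safety check reported in the error
    message above, not the real pyarrow code path.
    """
    quotient, remainder = divmod(value_ns, NS_PER_US)
    if remainder != 0:
        raise ValueError(
            "Casting from timestamp[ns] to timestamp[us] "
            f"would lose data: {value_ns}"
        )
    return quotient


# 2017-01-01T00:00:00Z expressed as nanoseconds since the Unix epoch.
start_ns = 1_483_228_800_000_000_000

# freq='1n': consecutive index values are 1 ns apart.
ns_index = [start_ns + i for i in range(3)]

# freq='1u': consecutive index values are 1000 ns (1 us) apart.
us_index = [start_ns + i * NS_PER_US for i in range(3)]

# Every value of the microsecond-spaced index converts cleanly.
us_values = [cast_ns_to_us(v) for v in us_index]

# The second value of the nanosecond-spaced index is not a whole number
# of microseconds, so the cast is rejected.
try:
    cast_ns_to_us(ns_index[1])
except ValueError as exc:
    print(exc)
```

The first value of the nanosecond index (the start timestamp itself) is a whole number of microseconds, which is why the reported failure names the second value, 1483228800000000001.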
Issue Links
- depends upon
  - PARQUET-1411 [C++] Upgrade to use LogicalType annotations instead of ConvertedType (Resolved)
- is related to
  - ARROW-2026 [Python] Cast all timestamp resolutions to INT96 use_deprecated_int96_timestamps=True (Resolved)
  - ARROW-3729 [C++] Support for writing TIMESTAMP_NANOS Parquet metadata (Resolved)