Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 3.0.0
Environment:
- macOS Big Sur 11.2.1
- python 3.8.2
Description
Writing a pyarrow Table that contains mixed-precision Decimals with pyarrow.parquet.write_table produces a Parquet file with invalid values.
In the example below, the first value of data_decimal changes from Decimal('579.11999511718795474735088646411895751953125000000000') in the pyarrow table to Decimal('-378.68971792399258172661600550482428224218070136475136') in the Parquet file.
import pyarrow
import pyarrow.parquet  # the submodule must be imported explicitly
from decimal import Decimal

values_floats = [579.119995117188, 6.40999984741211, 2.0]  # floats
decs_from_values = [Decimal(v) for v in values_floats]  # Decimal
decs_from_float = [Decimal.from_float(v) for v in values_floats]
decs_str = [Decimal(str(v)) for v in values_floats]  # Decimal

data_dict = {
    "data_decimal": decs_from_values,  # python Decimal
    "data_decimal_from_float": decs_from_float,
    "data_float": values_floats,  # python floats
    "data_dec_str": decs_str,
}

table = pyarrow.table(data=data_dict)
print(table.to_pydict())  # before saving
pyarrow.parquet.write_table(table, "./pyarrow_decimal.parquet")  # saving
print(pyarrow.parquet.read_table("./pyarrow_decimal.parquet").to_pydict())  # after saving
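To make "mixed-precision" concrete, here is a minimal sketch that uses only the standard library to inspect the digits and exponent of the Decimals built in the example above. The helper name precision_and_scale is hypothetical and not part of pyarrow. It only illustrates that the values in one column need different precisions and scales, while an Arrow decimal column carries a single precision and scale for all of its values.

```python
from decimal import Decimal

values_floats = [579.119995117188, 6.40999984741211, 2.0]


def precision_and_scale(d):
    """Hypothetical helper: (total digits, digits after the point) for a finite Decimal."""
    _sign, digits, exponent = d.as_tuple()
    scale = max(-exponent, 0)
    # Account for leading zeros after the point and for positive exponents.
    precision = max(len(digits), scale) + max(exponent, 0)
    return precision, scale


# Decimal(float) keeps the exact binary expansion of each float,
# so every value ends up with its own number of fractional digits.
exact = [Decimal(v) for v in values_floats]
exact_shapes = [precision_and_scale(d) for d in exact]

# Decimal(str(float)) keeps only the digits of the float's shortest repr.
from_str = [Decimal(str(v)) for v in values_floats]
str_shapes = [precision_and_scale(d) for d in from_str]

print("exact:   ", exact_shapes)
print("from str:", str_shapes)
```

In both lists the scale differs from value to value, which is the situation the report describes for data_decimal and data_dec_str.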