Parquet / PARQUET-1869

[C++] Large decimal values don't roundtrip correctly

Details

    • Type: Test
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-cpp
    • Labels: None

    Description

      Reproducer with Python (pyarrow):

      import decimal
      import pyarrow as pa
      import pyarrow.parquet as pq
      
      arr = pa.array([decimal.Decimal('9223372036854775808'), decimal.Decimal('1.111')])
      print(arr)
      
      pq.write_table(pa.table({'a': arr}), "test_decimal.parquet") 
      result = pq.read_table("test_decimal.parquet")
      print(result.column('a'))
      

      gives

      # before writing
      <pyarrow.lib.Decimal128Array object at 0x7fd07d79a468>
      [
        9223372036854775808.000,
        1.111
      ]
      # after reading
      <pyarrow.lib.ChunkedArray object at 0x7fd0711e9f98>
      [
        [
          -221360928884514619.392,
          1.111
        ]
      ]
      

      I tried reading the file with a different Parquet implementation (the fastparquet Python package), and it returns the same values on read, so the issue is probably on the write side.
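
      The two printed values can be compared as unscaled integers to see how they differ. The sketch below uses only the standard library and assumes a scale of 3, as the printed output suggests; the helper names `unscaled` and `truncate_signed` are hypothetical. It shows that the value read back differs from the original by exactly 2**73, which is what one would get if only the low 9 bytes of the two's-complement representation were kept and then sign-extended. This is an observation about the numbers, not a confirmed diagnosis of where the bytes are lost.

```python
from decimal import Decimal

SCALE = 3  # assumption: both printed values show three fractional digits


def unscaled(value: Decimal, scale: int = SCALE) -> int:
    """Return the integer a decimal with the given scale would store."""
    return int(value * 10**scale)


def truncate_signed(n: int, nbytes: int) -> int:
    """Keep the low `nbytes` of n (two's complement) and reinterpret as signed."""
    raw = n.to_bytes(16, "big", signed=True)
    return int.from_bytes(raw[-nbytes:], "big", signed=True)


written = unscaled(Decimal("9223372036854775808"))
read_back = unscaled(Decimal("-221360928884514619.392"))

# The two values differ by exactly 2**73.
print(written - read_back == 2**73)              # True
# Keeping only 9 bytes reproduces the corrupted value ...
print(truncate_signed(written, 9) == read_back)  # True
# ... while 10 bytes are enough to preserve the original.
print(truncate_signed(written, 10) == written)   # True
```

      If this hypothesis holds, the second value (1.111) is unaffected simply because it fits comfortably in 9 bytes.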


          People

            Assignee: Unassigned
            Reporter: Joris Van den Bossche (jorisvandenbossche)
