PARQUET-1869

[C++] Large decimal values don't roundtrip correctly


    Details

    • Type: Test
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-cpp
    • Labels: None

      Description

      Reproducer with Python:

      import decimal
      import pyarrow as pa
      import pyarrow.parquet as pq
      
      arr = pa.array([decimal.Decimal('9223372036854775808'), decimal.Decimal('1.111')])
      print(arr)
      
      pq.write_table(pa.table({'a': arr}), "test_decimal.parquet") 
      result = pq.read_table("test_decimal.parquet")
      print(result.column('a'))
      

      gives

      # before writing
      <pyarrow.lib.Decimal128Array object at 0x7fd07d79a468>
      [
        9223372036854775808.000,
        1.111
      ]
      # after reading
      <pyarrow.lib.ChunkedArray object at 0x7fd0711e9f98>
      [
        [
          -221360928884514619.392,
          1.111
        ]
      ]
      

      I tried reading the file with a different Parquet implementation (the fastparquet Python package), and it gives the same values on read, so the issue is more likely on the write side.
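
      A quick check on the numbers above, offered as an observation rather than a confirmed diagnosis: the wrong value is exactly what you get if the unscaled integer of the original decimal (scale 3, as printed) is cut down to its low 9 bytes and then read back as a signed two's-complement number. The 9-byte width and the helper names below are assumptions chosen for illustration; nothing here inspects the actual Parquet file or the writer code.

```python
from decimal import Decimal

SCALE = 3  # digits after the decimal point, taken from the printed arrays


def unscaled(value: Decimal, scale: int = SCALE) -> int:
    """Return the integer that a decimal with the given scale stores."""
    return int(value * (10 ** scale))


def truncate_signed(n: int, num_bytes: int) -> int:
    """Keep only the low num_bytes bytes of n, then reinterpret them
    as a signed big-endian two's-complement integer."""
    raw = (n % (1 << (8 * num_bytes))).to_bytes(num_bytes, "big")
    return int.from_bytes(raw, "big", signed=True)


written = unscaled(Decimal("9223372036854775808"))        # value before writing
read_back = unscaled(Decimal("-221360928884514619.392"))  # value after reading

# Hypothetical 9-byte storage reproduces the corrupted value exactly.
assert truncate_signed(written, 9) == read_back

# With 10 bytes the value would survive unchanged.
assert truncate_signed(written, 10) == written
```

      If that is what happens, it would be consistent with the value being stored in fewer bytes than its 22 significant digits need, which again points at the write side.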

            People

            • Assignee: Unassigned
            • Reporter: Joris Van den Bossche (jorisvandenbossche)
