Apache Arrow / ARROW-2153

[C++/Python] Decimal conversion not working for exponential notation

      Description

      import pyarrow as pa
      import pandas as pd
      import decimal
      
      pa.Table.from_pandas(pd.DataFrame({'a': [decimal.Decimal('1.1'), decimal.Decimal('2E+1')]}))
      

       

      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "pyarrow/table.pxi", line 875, in pyarrow.lib.Table.from_pandas (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:44927)
        File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 350, in dataframe_to_arrays
          convert_types)]
        File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 349, in <listcomp>
          for c, t in zip(columns_to_convert,
        File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 345, in convert_column
          return pa.array(col, from_pandas=True, type=ty)
        File "pyarrow/array.pxi", line 170, in pyarrow.lib.array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:29224)
        File "pyarrow/array.pxi", line 70, in pyarrow.lib._ndarray_to_array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:28465)
        File "pyarrow/error.pxi", line 77, in pyarrow.lib.check_status (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:8270)
      pyarrow.lib.ArrowInvalid: Expected base ten digit or decimal point but found 'E' instead.
      

      When writing values by hand we can of course use decimal.Decimal('20') instead of decimal.Decimal('2E+1'). Arithmetic inside an application, however, can produce exponential notation outside the caller's control (it is the normalized form of the decimal number), and for some values exponential notation is the only form that expresses the significance. The conversion should therefore accept it.
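
A minimal sketch of the point above, using only the standard `decimal` module (no pyarrow involved). The precision of 2 in the last step is an arbitrary value chosen for illustration, not something taken from this report.

```python
from decimal import Decimal, localcontext

# The plain and exponential spellings compare equal as numbers...
plain = Decimal('20')
exponential = Decimal('2E+1')
same_value = (plain == exponential)

# ...but they record different significance (coefficient digits).
plain_digits = plain.as_tuple().digits              # two digits: 2 and 0
exponential_digits = exponential.as_tuple().digits  # one digit: 2

# normalize() strips trailing zeros, which yields the exponential form.
normalized_text = str(plain.normalize())

# Rounding to the context precision can also introduce a positive exponent,
# even though neither operand was written in exponential notation.
with localcontext() as ctx:
    ctx.prec = 2  # illustrative precision only
    rounded_text = str(Decimal('150') * Decimal('2'))
```

Here `normalized_text` and `rounded_text` both contain an `E`, so any consumer that only accepts digits and a decimal point would reject them.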

      The Python decimal documentation suggests the following transformation, but it is only usable when the significance information does not need to be preserved:

      from decimal import Decimal

      def remove_exponent(d):
          return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
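
To illustrate why this workaround changes the significance, here is a small sketch that applies `remove_exponent` to the value from this report and to a value with a trailing zero. The sample value `'1.10'` is an assumption added for illustration.

```python
from decimal import Decimal


def remove_exponent(d):
    # Integral values are rewritten without an exponent; others are normalized.
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()


original = Decimal('2E+1')
converted = remove_exponent(original)

# The numeric value is unchanged and the text no longer contains an exponent.
converted_text = str(converted)
still_equal = (converted == original)

# The coefficient gained a digit, so the recorded significance differs.
original_digit_count = len(original.as_tuple().digits)
converted_digit_count = len(converted.as_tuple().digits)

# Non-integral values lose trailing zeros through normalize().
trimmed_text = str(remove_exponent(Decimal('1.10')))
```

The value survives the round trip, but the number of significant digits does not, which is the limitation described above.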
      

              People

              • Assignee: Phillip Cloud (cpcloud)
              • Reporter: Antony Mayi (antonymayi)
