Apache Arrow / ARROW-2020

[Python] Parquet segfaults if coercing ns timestamps and writing 96-bit timestamps

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.10.0
    • Component/s: Python
    • Labels:
    • Environment:
      OS: Mac OS X 10.13.2
      Python: 3.6.4
      PyArrow: 0.8.0

      Description

      If you try to write a PyArrow table containing nanosecond-resolution timestamps to Parquet using `coerce_timestamps` and `use_deprecated_int96_timestamps=True`, the Arrow library will segfault.

      The crash doesn't happen if you don't coerce the timestamp resolution or if you don't use 96-bit timestamps.
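
For context on the 96-bit option: the legacy INT96 timestamp is commonly described as a 12-byte little-endian value holding the nanoseconds elapsed within the day (8 bytes) followed by the Julian day number (4 bytes). The sketch below illustrates that layout in plain Python. It is not PyArrow's implementation, the helper names `encode_int96` and `decode_int96` are hypothetical, and it does not claim to explain the crash. It only shows that the layout always carries nanoseconds and has no field for a coarser unit such as 'us' or 'ms'.

```python
import struct

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
NANOS_PER_DAY = 86_400 * 1_000_000_000


def encode_int96(ns_since_epoch: int) -> bytes:
    """Pack nanoseconds since the Unix epoch into the 12-byte INT96 layout."""
    # divmod floors, so nanos_of_day is non-negative even before 1970.
    days, nanos_of_day = divmod(ns_since_epoch, NANOS_PER_DAY)
    # "<qi": little-endian, 8-byte nanoseconds of day, 4-byte Julian day.
    return struct.pack("<qi", nanos_of_day, days + JULIAN_DAY_OF_UNIX_EPOCH)


def decode_int96(raw: bytes) -> int:
    """Inverse of encode_int96: return nanoseconds since the Unix epoch."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day
```

Round-tripping any integer nanosecond count through these two helpers returns the original value, so the format itself loses no precision at 'ns' resolution.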

      To Reproduce:

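      import datetime
      
      import pyarrow
      from pyarrow import parquet
      
      # Schema for the single nanosecond-resolution timestamp column
      # (defined for reference; not used further below).
      schema = pyarrow.schema([
          pyarrow.field('last_updated', pyarrow.timestamp('ns')),
      ])
      
      # One-row column holding the current time as a 'ns' timestamp.
      data = [
          pyarrow.array([datetime.datetime.now()], pyarrow.timestamp('ns')),
      ]
      
      table = pyarrow.Table.from_arrays(data, ['last_updated'])
      
      # Combining coercion with INT96 output triggers the segfault.
      with open('test_file.parquet', 'wb') as fdesc:
          parquet.write_table(table, fdesc,
                              coerce_timestamps='us',  # 'ms' works too
                              use_deprecated_int96_timestamps=True)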
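
Because the script above takes down the interpreter that runs it, one way to observe the crash without losing the calling session is to execute the script in a child process and inspect its exit status. The following is a minimal sketch, assuming a POSIX system (where a negative return code means the child was killed by a signal). `run_isolated` is a hypothetical helper and not part of PyArrow.

```python
import signal
import subprocess
import sys


def run_isolated(source, timeout=60.0):
    """Run Python source in a child interpreter.

    Returns (returncode, signal_name); signal_name is None unless the
    child was terminated by a signal (POSIX only).
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    signal_name = None
    if proc.returncode < 0:
        try:
            signal_name = signal.Signals(-proc.returncode).name
        except ValueError:
            # Signal number without a named constant on this platform.
            signal_name = "signal %d" % -proc.returncode
    return proc.returncode, signal_name
```

Passing the reproduction script as `source` would be expected to report `SIGSEGV` on an affected installation, while the parent process keeps running.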

      See attached file for the crash report.

        Attachments

        1. crash-report.txt
          53 kB
          Diego Argueta

              People

              • Assignee: Joshua Storck (joshuastorck)
              • Reporter: Diego Argueta (yiannisliodakis)
              • Votes: 0
              • Watchers: 5
