Apache Arrow / ARROW-3514

[Python] zlib deflate exception when writing Parquet file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.11.1
    • Component/s: C++, Python
    • Environment: Amazon Linux, CentOS 7, Ubuntu 16.04, zlib 1.2.7/1.2.8, CPython 3.6.

    Description

      The Python code below raises an exception with pyarrow 0.11.0, but not with 0.10.0.

      I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu 16.04, but not on Windows 7.

      The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu machine is using 1.2.8.

      Tested with CPython 3.6 in all cases.

      import io
      import pyarrow
      from pyarrow import parquet
      
      tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
      
      f = io.BytesIO()
      parquet.write_table(tbl, f, compression='gzip')
      

      The resulting traceback is:

      Traceback (most recent call last):
        File "test_pyarrow.py", line 8, in <module>
          parquet.write_table(tbl, f, compression='gzip')
        File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
          writer.write_table(table, row_group_size=row_group_size)
        File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
          self.writer.write_table(table, row_group_size=row_group_size)
        File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
        File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
      pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
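
      The traceback does not say why the output buffer was too small. As an illustration only, and not as the confirmed root cause, the sketch below uses Python's standard zlib module (not pyarrow) to show that gzip framing adds a fixed header and trailer on top of the raw deflate stream, so for a tiny input the compressed output is larger than the input itself. The payload and the helper name deflate are assumptions made for this example; any output buffer sized from the input length alone would be too small in this situation.

```python
import zlib

# Hypothetical payload, comparable in size to the two short strings in the report.
payload = b"abcdef"

def deflate(data, wbits):
    # wbits=-15 produces a raw deflate stream;
    # wbits=31 (16 + 15) wraps the same stream in gzip framing.
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()

raw = deflate(payload, -15)
gz = deflate(payload, 31)

# Input size, raw deflate size, gzip size.
print(len(payload), len(raw), len(gz))

# The gzip output still round-trips to the original bytes.
assert zlib.decompress(gz, wbits=31) == payload
```

      The gzip header and trailer account for at least 18 bytes here, which is three times the size of the payload.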
      

            People

              Assignee: Antoine Pitrou (apitrou)
              Reporter: Adam Machanic (amachanic)
                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 20m