  1. Apache Arrow
  2. ARROW-2591

[Python] Segmentation fault when writing empty ListType column to Parquet


Details

    Description

      Context: I am working on serializing sparse columns to Parquet. In some cases many rows are empty, and some columns contain only empty lists.
      However, I get a segmentation fault when I try to write columns that contain only empty lists to Parquet.

      Here is a simple code snippet that reproduces the segmentation fault:

      In [1]: import pyarrow as pa
      
      In [2]: import pyarrow.parquet as pq
      
      In [3]: pa_ar = pa.array([[],[]],pa.list_(pa.int32()))
      
      In [4]: table = pa.Table.from_arrays([pa_ar],["test"])
      
      In [5]: pq.write_table(
         ...:     table=table,
         ...:     where="test.parquet",
         ...:     compression="snappy",
         ...:     flavor="spark"
         ...: )
      Segmentation fault
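
Because the crash kills the interpreter, it can help to run the reproduction in a child process and inspect its exit status. The following is a minimal sketch, not part of the original report: the helper names `run_isolated` and `describe` are hypothetical, and the "killed by signal" interpretation of a negative return code assumes a POSIX system.

```python
import subprocess
import sys
import textwrap

# The reproduction from the session above, as a standalone script.
REPRO = textwrap.dedent("""
    import pyarrow as pa
    import pyarrow.parquet as pq

    pa_ar = pa.array([[], []], pa.list_(pa.int32()))
    table = pa.Table.from_arrays([pa_ar], ["test"])
    pq.write_table(table=table, where="test.parquet",
                   compression="snappy", flavor="spark")
""")


def run_isolated(code):
    """Run `code` in a fresh interpreter and return its exit status.

    Hypothetical helper: a crash in the child does not take down
    the calling process.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable message."""
    if returncode == 0:
        return "completed normally"
    if returncode < 0:
        # Assumption: on POSIX, subprocess reports death by signal N as -N.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"
```

Calling `print(describe(run_isolated(REPRO)))` on an affected installation would be expected to report a signal rather than a normal exit. A non-zero positive status usually means the script raised an ordinary Python exception instead, for example because pyarrow is not installed.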
      
      

      May I have it fixed?

      Best

      Jacques

            People

              kszucs Krisztian Szucs
              jafournier jacques
              Votes: 2
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m