Apache Arrow / ARROW-3208

[C++] Segmentation fault when casting dictionary to numeric with nullptr valid_bitmap

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.13.0
    • Component/s: C++
    • Environment: Ubuntu 16.04 LTS; System76 Oryx Pro

      Description

      Steps to reproduce:

      1. Create a partitioned dataset with the following code:

      ```python
      import numpy as np
      import pandas as pd
      import pyarrow as pa
      import pyarrow.parquet as pq

      df = pd.DataFrame({
          'one': [-1, 10, 2.5, 100, 1000, 1, 29.2],
          'two': [-1, 10, 2, 100, 1000, 1, 11],
          'three': [0, 0, 0, 0, 0, 0, 0],
      })
      table = pa.Table.from_pandas(df)
      pq.write_to_dataset(table, root_path='/home/yingw787/misc/example_dataset', partition_cols=['one', 'two'])
      ```

      2. Read the partitioned dataset back into a PyArrow Table and write that table to a single Parquet file:

      ```python
      import pyarrow.parquet as pq

      # '/path/to/dataset' stands for the root_path written in step 1.
      table = pq.ParquetDataset('/path/to/dataset').read()
      pq.write_table(table, '/path/to/example.parquet')
      ```

      EXPECTED:

      • Successful write

      GOT:

      • Segmentation fault
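
For context, `write_to_dataset` with `partition_cols` lays the data out in hive-style `name=value` directories, and the issue title suggests that the partition columns come back dictionary-encoded when the dataset is read again. The following is a minimal pure-Python sketch of that directory layout for the sample data in step 1. The helper `partition_dirs` is hypothetical (it is not part of PyArrow), and the exact value formatting PyArrow uses in directory names may differ.

```python
# Partition columns from the DataFrame in step 1.
one = [-1, 10, 2.5, 100, 1000, 1, 29.2]
two = [-1, 10, 2, 100, 1000, 1, 11]


def partition_dirs(columns):
    """Hypothetical helper: return the sorted, distinct hive-style
    'name=value/name=value' directories implied by the given columns."""
    names = list(columns)
    rows = zip(*(columns[name] for name in names))
    return sorted(
        {"/".join(f"{n}={v}" for n, v in zip(names, row)) for row in rows}
    )


# pandas stores column 'one' as float64 because it mixes ints and floats,
# so the values are converted to float here to mirror that.
dirs = partition_dirs({"one": [float(v) for v in one], "two": two})
for d in dirs:
    print(d)
```

Each of the seven rows has a distinct (`one`, `two`) pair, so this sketch yields seven leaf directories. Column `three` is the only one that would be stored inside the Parquet files themselves; `one` and `two` would have to be rebuilt from the directory names on read.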

      Issue reference on GitHub mirror: https://github.com/apache/arrow/issues/2511

              People

              • Assignee: Francois Saint-Jacques (fsaintjacques)
              • Reporter: Ying Wang (yingw787)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 20m