Parquet / PARQUET-918

FromParquetSchema API crashes on nested schemas


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: cpp-1.0.0
    • Fix Version/s: cpp-1.1.0
    • Component/s: parquet-cpp
    • Labels: None

    Description

      The second overload of FromParquetSchema (src/parquet/arrow/schema.cc:276) misuses its column_indices parameter: it treats the values as indices into the direct fields of the schema root.
      This breaks on Parquet files with nested schemas. There the root can have fewer direct fields than the file has columns, so a column index can fall past the end of the fields vector, and the out-of-bounds access crashes the process.

      This bug is masked by another bug in the first overload of FromParquetSchema, which builds a complete index list whose length is the number of schema fields instead of the number of columns.

      The bug is triggered in many significant use cases, for example when reading through the arrow::ReadTable API.
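
      The mismatch between column indices and root field indices can be illustrated with a minimal, self-contained sketch. It does not use parquet-cpp: Node, CountLeaves, RootFieldForColumn and ExampleSchema are hypothetical names introduced only for this illustration, and the sketch assumes that column indices number the leaf (primitive) nodes of the schema in depth-first order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema node; NOT the parquet-cpp type.
// A node with no children is treated as a leaf (primitive) column.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Counts the leaf columns below a node (a leaf counts as one column).
std::size_t CountLeaves(const Node& node) {
  if (node.children.empty()) return 1;
  std::size_t total = 0;
  for (const Node& child : node.children) total += CountLeaves(child);
  return total;
}

// Maps a leaf column index to the index of the top-level field that
// contains it. Returns root.children.size() when the column index is
// out of range, instead of indexing past the end of the vector.
std::size_t RootFieldForColumn(const Node& root, std::size_t column_index) {
  std::size_t first_leaf = 0;  // index of the first leaf under field i
  for (std::size_t i = 0; i < root.children.size(); ++i) {
    const std::size_t leaves = CountLeaves(root.children[i]);
    if (column_index < first_leaf + leaves) return i;
    first_leaf += leaves;
  }
  return root.children.size();
}

// Example schema: root { id; point { x; y } }.
// It has 2 top-level fields but 3 leaf columns (id, point.x, point.y).
Node ExampleSchema() {
  return Node{"root",
              {Node{"id", {}},
               Node{"point", {Node{"x", {}}, Node{"y", {}}}}}};
}
```

      In this example the column index 2 (point.y) is a valid column, but using it directly as an index into the two top-level fields would read past the end of the vector. Mapping it through RootFieldForColumn yields field 1 (point) instead.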

          People

            itaiin Itai Incze
            itaiin Itai Incze