  1. Apache Drill
  2. DRILL-6670

Error in parquet record reader - previously readable file fails to be read in 1.14

      Description

      A Parquet file generated by PyArrow was readable in Apache Drill 1.12 and 1.13, but fails to be read in 1.14.

      Running the query "SELECT * FROM dfs.`foo.parquet`" in the Drill web query UI produces the following error message:

      Query Failed: An Error Occurred
      
      org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: Error in parquet record reader. Message: Failure in setting up reader Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional binary name (UTF8); optional binary creation_parameters (UTF8); optional int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional int32 schema_version; } , metadata: {pandas={"index_columns": [], "column_indexes": [], "columns": [{"name": "name", "field_name": "name", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "creation_parameters", "field_name": "creation_parameters", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "creation_date", "field_name": "creation_date", "pandas_type": "datetime", "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", "metadata": null}, {"name": "schema_version", "field_name": "schema_version", "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, ColumnMetaData{SNAPPY [creation_parameters] optional binary creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
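
The schema embedded in the error message above is hard to read as a single line. As a rough aid, the following sketch pulls the column names, physical types, and annotations out of that schema string with a regular expression. The `SCHEMA` constant is copied from the message; `FIELD` and `parse_fields` are hypothetical helper names, and the pattern only handles flat, non-nested fields like the ones in this file. It is not a general Parquet schema parser, and it makes no claim about which column the 1.14 reader fails on.

```python
import re

# Schema portion copied from the error message in this report.
SCHEMA = (
    "message schema { optional binary name (UTF8); "
    "optional binary creation_parameters (UTF8); "
    "optional int64 creation_date (TIMESTAMP_MICROS); "
    "optional int32 data_version; "
    "optional int32 schema_version; }"
)

# Groups: repetition, physical type, column name, optional annotation.
# Assumption: fields are flat (no nested groups), as in this schema.
FIELD = re.compile(
    r"(optional|required|repeated)\s+(\w+)\s+(\w+)(?:\s+\((\w+)\))?\s*;"
)


def parse_fields(schema):
    """Return (name, physical_type, annotation_or_None) for each flat field."""
    return [(m.group(3), m.group(2), m.group(4)) for m in FIELD.finditer(schema)]


for name, physical, annotation in parse_fields(SCHEMA):
    print(f"{name:20} {physical:7} {annotation or '-'}")
```

Listing the fields this way shows two UTF8 binary columns, one int64 column annotated TIMESTAMP_MICROS, and two unannotated int32 columns, which may help when comparing against a file that Drill 1.14 reads successfully.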
      

              People

              • Assignee: Oleksandr Kalinin (okalinin)
              • Reporter: Dave Challis (suicas)
              • Reviewer: Arina Ielchiieva
              • Votes: 0
              • Watchers: 5
