Spark / SPARK-21666

Cannot handle Parquet type FIXED_LEN_BYTE_ARRAY


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      I have a parquet schema that looks like this:

      optional group connection {
        required fixed_len_byte_array(6) localMacAddress;
        required fixed_len_byte_array(6) remoteMacAddress;
      }

      When I try to load this parquet file in Spark, I get:
      Caused by: org.apache.spark.sql.AnalysisException: Illegal Parquet type: FIXED_LEN_BYTE_ARRAY;
      at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.illegalType$1(ParquetSchemaConverter.scala:126)
      at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertPrimitiveField(ParquetSchemaConverter.scala:193)
      at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:108)
      at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$2.apply(ParquetSchemaConverter.scala:90)
      at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$2.apply(ParquetSchemaConverter.scala:84)
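
      The exception is thrown from ParquetSchemaConverter.convertPrimitiveField, which rejects any primitive type it has no Catalyst mapping for; in Spark 2.x, FIXED_LEN_BYTE_ARRAY appears to be accepted only when annotated as a DECIMAL logical type, so an unannotated field like the MAC addresses above falls through to illegalType. A rough Python sketch of that decision (names and type table are illustrative, not Spark's actual code):

      ```python
      # Hypothetical sketch of the Parquet-to-Catalyst primitive-type mapping;
      # the table below is illustrative, not Spark's actual implementation.
      SUPPORTED = {
          "BOOLEAN": "BooleanType",
          "INT32": "IntegerType",
          "INT64": "LongType",
          "FLOAT": "FloatType",
          "DOUBLE": "DoubleType",
          "BINARY": "BinaryType",
      }

      def convert_primitive(parquet_type: str, logical_type: str = None) -> str:
          """Map a Parquet primitive type name to a Spark SQL type name."""
          # FIXED_LEN_BYTE_ARRAY is only handled when annotated as DECIMAL;
          # with no annotation it falls through to the error path.
          if parquet_type == "FIXED_LEN_BYTE_ARRAY" and logical_type == "DECIMAL":
              return "DecimalType"
          try:
              return SUPPORTED[parquet_type]
          except KeyError:
              # Mirrors the AnalysisException seen in the stack trace above.
              raise ValueError(f"Illegal Parquet type: {parquet_type}")
      ```

      Under this sketch, convert_primitive("FIXED_LEN_BYTE_ARRAY") raises the same "Illegal Parquet type" message, while the DECIMAL-annotated case succeeds.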

      We are not able to change the schema, so this issue prevents us from processing the data.

      Duplicate of https://issues.apache.org/jira/browse/SPARK-2489


    People

      Assignee: Unassigned
      Reporter: Paul Praet (praetp)
