PARQUET-132

AvroParquetInputFormat should use a parameterized type


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels: None

    Description

      The AvroParquetInputFormat currently extends ParquetInputFormat<IndexedRecord>, which works for regular MapReduce jobs. But Spark's hadoopRDD and newAPIHadoopRDD methods (correctly) create an RDD with the types from the InputFormat. As a result, the RDD's value type is always IndexedRecord rather than the record type actually being read.

      The class should instead be declared as AvroParquetInputFormat<T extends IndexedRecord> extends ParquetInputFormat<T>.
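
      A minimal sketch of the difference, not the actual Parquet or Spark code. IndexedRecord and ParquetInputFormat here are simplified stand-ins for the real classes, and User, FixedAvroParquetInputFormat, and elements() are hypothetical names used only for illustration. elements() plays the role of a caller that, like the Spark methods mentioned above, takes its element type from the input format's type parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParameterizedInputFormatSketch {

    // Simplified stand-in for Avro's IndexedRecord (the real interface has more methods).
    interface IndexedRecord {
        Object get(int i);
    }

    // Hypothetical record class, standing in for a concrete Avro record type.
    static final class User implements IndexedRecord {
        final String name;
        User(String name) { this.name = name; }
        @Override public Object get(int i) {
            if (i != 0) throw new IndexOutOfBoundsException("User has one field");
            return name;
        }
    }

    // Stand-in for ParquetInputFormat<T>: it only hands back in-memory rows.
    static class ParquetInputFormat<T> {
        private final List<T> rows;
        ParquetInputFormat(List<T> rows) { this.rows = rows; }
        List<T> read() { return new ArrayList<>(rows); }
    }

    // Current shape: the value type is pinned to IndexedRecord.
    static class FixedAvroParquetInputFormat extends ParquetInputFormat<IndexedRecord> {
        FixedAvroParquetInputFormat(List<IndexedRecord> rows) { super(rows); }
    }

    // Proposed shape: the value type is a parameter bounded by IndexedRecord.
    static class AvroParquetInputFormat<T extends IndexedRecord> extends ParquetInputFormat<T> {
        AvroParquetInputFormat(List<T> rows) { super(rows); }
    }

    // Hypothetical caller whose result type is taken from the format's type parameter.
    static <V> List<V> elements(ParquetInputFormat<V> format) {
        return format.read();
    }

    public static void main(String[] args) {
        // With the fixed type, callers only see IndexedRecord and must cast.
        List<IndexedRecord> untypedRows = new ArrayList<>();
        untypedRows.add(new User("ada"));
        List<IndexedRecord> untyped = elements(new FixedAvroParquetInputFormat(untypedRows));
        User first = (User) untyped.get(0);

        // With the parameterized type, the record class flows through to the caller.
        List<User> typedRows = Arrays.asList(new User("ada"));
        List<User> typed = elements(new AvroParquetInputFormat<User>(typedRows));

        System.out.println(first.name + " " + typed.get(0).name);
    }
}
```

      In the sketch, the only change between the two subclasses is the type parameter, yet it decides whether the caller gets a List<IndexedRecord> or a List<User>.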


          People

            Assignee: Unassigned
            Reporter: Ryan Blue (rdblue)
            Votes: 0
            Watchers: 3
