[PARQUET-132] AvroParquetInputFormat should use a parameterized type - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.6.0
Component/s: None
Labels: None

Description

AvroParquetInputFormat currently extends ParquetInputFormat<IndexedRecord>, which works for regular MapReduce jobs. However, Spark's hadoopRDD and newAPIHadoopRDD methods (correctly) create an RDD whose types come from the InputFormat. As a result, the RDD is always typed with IndexedRecord rather than the actual record class.

AvroParquetInputFormat should instead be declared as AvroParquetInputFormat<T extends IndexedRecord> extends ParquetInputFormat<T>.
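
A minimal sketch of the difference, using simplified stand-ins rather than the real Avro and Parquet classes. The nested IndexedRecord and ParquetInputFormat types, the add/readAll helpers, and the User record are all hypothetical and exist only to show how the type parameter reaches the caller:

```java
import java.util.ArrayList;
import java.util.List;

final class ParameterizedInputFormatSketch {

    // Hypothetical, simplified stand-in for Avro's IndexedRecord.
    interface IndexedRecord {
        Object get(int i);
    }

    // Hypothetical stand-in for ParquetInputFormat<T>; it only stores records
    // in memory so that the effect of the type parameter is visible.
    static class ParquetInputFormat<T> {
        private final List<T> records = new ArrayList<>();

        void add(T record) {
            records.add(record);
        }

        List<T> readAll() {
            return records;
        }
    }

    // Current shape: the record type is fixed to IndexedRecord.
    static class FixedAvroParquetInputFormat extends ParquetInputFormat<IndexedRecord> {
    }

    // Proposed shape: the record type is a bounded type parameter.
    static class AvroParquetInputFormat<T extends IndexedRecord> extends ParquetInputFormat<T> {
    }

    // Hypothetical specific record class.
    static final class User implements IndexedRecord {
        final String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public Object get(int i) {
            if (i == 0) {
                return name;
            }
            throw new IndexOutOfBoundsException("no field at index " + i);
        }
    }

    // With the fixed type, callers only see IndexedRecord and must cast.
    static String firstUserNameWithCast() {
        FixedAvroParquetInputFormat format = new FixedAvroParquetInputFormat();
        format.add(new User("ada"));
        IndexedRecord record = format.readAll().get(0);
        return ((User) record).name;
    }

    // With the parameterized type, the specific record class is preserved.
    static String firstUserName() {
        AvroParquetInputFormat<User> format = new AvroParquetInputFormat<>();
        format.add(new User("ada"));
        User user = format.readAll().get(0); // no cast needed
        return user.name;
    }

    public static void main(String[] args) {
        System.out.println(firstUserNameWithCast());
        System.out.println(firstUserName());
    }
}
```

Any framework that derives its element types from the InputFormat's type arguments would see User in the second case and only IndexedRecord in the first, which is the behavior described for Spark above.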

People

Assignee: Unassigned

Reporter: Ryan Blue

Votes: 0

Watchers: 3

Dates

Created: 13/Nov/14 17:57

Updated: 23/Jun/24 03:27

Resolved: 19/Nov/14 04:20