[PARQUET-1594] Parquet File is not able to read from Spark and Hive - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: parquet-mr
Labels: None

Description

Issue: Caused by: java.io.IOException: Expected 35393 values in column chunk at maprfs:////path/date=20190605/caa63aa9-abfa-46e1-8221-10f6c669512d.parquet offset 4 but got 46402 values instead over 2 pages ending at file offset 341624

We receive Avro-serialized messages from Kafka, consume them with Spring Kafka, convert them to Parquet, and persist the resulting files to the MapR-FS (HDFS) file system.

I tried to reproduce the issue locally with the same Avro file, but I was able to read the resulting Parquet file successfully. I am not sure why the Parquet file is being corrupted on HDFS.
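
For context, the exception above reports that the value count recorded for the column chunk (35393) disagrees with the number of values actually found in its pages (46402 over 2 pages). The following is a minimal Java sketch of that kind of consistency check. It is not the parquet-mr implementation: the class name `ChunkValueCountCheck`, the method `verify`, the file name `example.parquet`, and the even split of the 46402 values across the two pages are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.List;

// Illustrative only: mimics the shape of the check behind the reported
// exception, using plain numbers instead of a real Parquet file.
class ChunkValueCountCheck {

    // Sums the per-page value counts and compares the total with the
    // count recorded for the column chunk in the file metadata.
    static void verify(long expectedValues, List<Long> pageValueCounts, String path)
            throws IOException {
        long actual = 0;
        for (long n : pageValueCounts) {
            actual += n;
        }
        if (actual != expectedValues) {
            throw new IOException("Expected " + expectedValues
                    + " values in column chunk at " + path
                    + " but got " + actual + " values instead over "
                    + pageValueCounts.size() + " pages");
        }
    }

    public static void main(String[] args) {
        // Hypothetical per-page split; the issue only gives the total (46402)
        // and the page count (2).
        List<Long> pages = List.of(23201L, 23201L);
        try {
            verify(35393L, pages, "example.parquet");
            System.out.println("value counts match");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the counts from the issue, `verify` throws, and `main` prints the mismatch message. A mismatch of this kind means the metadata and the page data in the file disagree. The sketch does not show which side is wrong or how the file came to be written that way.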

People

Assignee: Unassigned

Reporter: Prashanth pampanna desai

Votes: 0

Watchers: 2

Dates

Created: 11/Jun/19 21:01

Updated: 31/Mar/21 15:50