Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Issue: Caused by: java.io.IOException: Expected 35393 values in column chunk at maprfs:////path/date=20190605/caa63aa9-abfa-46e1-8221-10f6c669512d.parquet offset 4 but got 46402 values instead over 2 pages ending at file offset 341624
We receive Avro-serialized messages from Kafka. They are consumed with Spring Kafka, converted to Parquet, and persisted to the MapR-FS (HDFS) file system.
I tried to reproduce the issue locally with the same Avro file, but I was able to read the resulting Parquet file successfully. I am not sure why the Parquet file is being corrupted on HDFS.
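One quick check that might help narrow this down is to copy the affected file from MapR-FS to local disk and verify its outer structure. The sketch below uses only the JDK and relies on the Parquet file layout: an unencrypted file starts and ends with the 4-byte magic `PAR1`, and the 4 bytes before the trailing magic hold the footer length as a little-endian integer. The class name `ParquetEnvelopeCheck` and the method `check` are hypothetical names chosen for illustration. This check does not inspect pages, so it would not detect the value-count mismatch itself. It can only rule out a truncated or incompletely closed file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: checks only the outer "envelope" of an unencrypted Parquet file.
public class ParquetEnvelopeCheck {
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** Returns null if the envelope looks sane, otherwise a description of the problem. */
    public static String check(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long size = raf.length();
            // Smallest possible layout: leading magic (4) + footer length (4) + trailing magic (4).
            if (size < 12) {
                return "file too short: " + size + " bytes";
            }
            byte[] head = new byte[4];
            raf.readFully(head);
            if (!Arrays.equals(head, MAGIC)) {
                return "missing leading PAR1 magic";
            }
            // The last 8 bytes are: footer length (little-endian int), then the magic.
            byte[] tail = new byte[8];
            raf.seek(size - 8);
            raf.readFully(tail);
            if (!Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC)) {
                return "missing trailing PAR1 magic (file may be truncated or the writer was not closed)";
            }
            int footerLength = ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
            if (footerLength <= 0 || footerLength > size - 12) {
                return "implausible footer length: " + footerLength;
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a local copy of a Parquet file.
        for (String arg : args) {
            String problem = check(Paths.get(arg));
            System.out.println(arg + ": " + (problem == null ? "envelope OK" : problem));
        }
    }
}
```

If the envelope is intact for the failing file, the footer is probably readable and the mismatch lies between the column chunk metadata and the pages actually written, which would point at the write path rather than at truncation.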