  1. Parquet
  2. PARQUET-1594

Parquet file cannot be read from Spark or Hive

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-mr
    • Labels: None

    Description

      Issue: Caused by: java.io.IOException: Expected 35393 values in column chunk at maprfs:////path/date=20190605/caa63aa9-abfa-46e1-8221-10f6c669512d.parquet offset 4 but got 46402 values instead over 2 pages ending at file offset 341624 
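
      The message above reports a mismatch between the value count recorded for the column chunk and the sum of the value counts in the page headers actually read from it. The following is a minimal sketch of that kind of consistency check, not the parquet-mr implementation: `PageHeader` and `check_column_chunk` are hypothetical names, and the per-page split of 23201 + 23201 is invented for illustration, since the report only gives the total of 46402 over 2 pages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageHeader:
    """Hypothetical stand-in for a data page header; only the value count matters here."""
    num_values: int


def check_column_chunk(expected_values: int, pages: List[PageHeader]) -> int:
    """Read page headers until the expected count is reached, then verify the total.

    Returns the number of pages consumed, or raises IOError on a mismatch.
    """
    values_read = 0
    pages_read = 0
    for page in pages:
        if values_read >= expected_values:
            break  # the chunk is complete; remaining pages belong elsewhere
        values_read += page.num_values
        pages_read += 1
    if values_read != expected_values:
        raise IOError(
            f"Expected {expected_values} values in column chunk "
            f"but got {values_read} values instead over {pages_read} pages"
        )
    return pages_read


if __name__ == "__main__":
    # Assumed page sizes: only the total (46402) and page count (2) come from the report.
    try:
        check_column_chunk(35393, [PageHeader(23201), PageHeader(23201)])
    except IOError as err:
        print(err)
```

      Under these assumptions, the first page leaves the running total below 35393, so a second page is read and the total overshoots. That pattern suggests the page headers and the chunk metadata in the file disagree with each other, rather than a problem on the reading side.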

      We receive Avro-serialized messages from Kafka. They are consumed by Spring Kafka, converted to Parquet, and persisted to the MaprFS (HDFS) file system.

      I tried to reproduce the issue locally with the same Avro file, but there I was able to read the Parquet file successfully. I am not sure why the Parquet file is being corrupted in HDFS.

          People

            Assignee: Unassigned
            Reporter: Prashanth pampanna desai
            Votes: 0
            Watchers: 2
