[HIVE-14306] Hive Failed to read Parquet Files generated by SparkSQL - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.2.1
Fix Version/s: None
Component/s: CLI
Labels: None

Description

I'm trying to implement the following process:

1. Create a Hive Parquet table A using the Hive CLI.
2. Create an external table B with the same schema as A, pointing to an existing folder in HDFS that contains one CSV file.
3. Execute `insert into A select * from B` using SparkSQL.
4. Query table A.
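The steps above can be sketched as the following HiveQL. This is only an illustrative reproduction, not the reporter's actual script: the report does not give the schema, so the columns `id` and `name`, the comma delimiter, and the `LOCATION` path are all hypothetical placeholders. With these simple column types the exception may not reproduce at all.

```sql
-- Step 1 (Hive CLI): Parquet-backed table A.
-- Column names and types are hypothetical; the report does not list them.
CREATE TABLE A (
  id   INT,
  name STRING
)
STORED AS PARQUET;

-- Step 2 (Hive CLI): external table B with the same schema as A,
-- pointing to an existing HDFS folder that holds one CSV file.
-- The delimiter and the path below are placeholders.
CREATE EXTERNAL TABLE B (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/to/existing/csv/folder';

-- Step 3 (SparkSQL 1.6.2): copy the CSV rows into the Parquet table.
-- The report writes this without the TABLE keyword.
INSERT INTO TABLE A SELECT * FROM B;

-- Step 4 (Hive CLI): this is the query that fails in the report.
SELECT * FROM A;
```

Running step 3 once from SparkSQL and once from the Hive CLI, against otherwise identical tables, isolates the writer as the only variable.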

Weird things happen in steps 3 and 4.

If the `insert into` statement is executed by SparkSQL (1.6.2), the Hive CLI throws an exception when querying table A:
```
Failed with exception java.io.IOException:parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs://NEOInciteDataNode-1:8020/user/hive/warehouse/call_center/part-r-00000-b9b6962d-cbab-452b-835b-c10c6221b8fa.gz.parquet
```

But SparkSQL can query table A without trouble.

If the `insert` statement is executed by the Hive CLI, querying table A in the Hive CLI works fine.

So am I doing something wrong, or is this just a bug?

People

Assignee: Unassigned

Reporter: Teng Yutong

Votes: 0

Watchers: 1

Dates

Created: 21/Jul/16 07:58

Updated: 21/Jul/16 07:59