When I write Parquet files from a Spark job and try to read them in Hive as an external table, I get a NullPointerException. On further analysis, I found that my transformation (using the Dataset and DataFrame APIs) produced some NULL values before saving to Parquet. The two fields that contain only NULLs are of float data type. When I removed these two columns from the Parquet dataset, I was able to read it in Hive. By contrast, when I write the same job's output in ORC format, Hive reads it fine even with the all-NULL columns.
In short: when a column of any data type other than String is completely empty (all NULL) and written to Parquet, Hive cannot read it and throws a NullPointerException.