[IMPALA-779] Incompatible type error when querying file created from AvroParquetWriter. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Duplicate
Affects Version/s: Impala 1.2.1
Fix Version/s: None
Component/s: Backend
Labels: usability
Environment: CDH4.3, Impala 1.2.1

Target Version: Product Backlog

Description

Scenario:

1) Created a Parquet file with AvroParquetWriter in code; the file has roughly 100 columns.
2) Created an external Parquet table over this file, defined with only the first 4 columns, and queried all of them successfully.
3) Created a second external table over the same file, defined with the last 4 columns. The query blows up, complaining about the file's first column, which isn't even in the table definition.

[rd-namenode.explorys:21000] > select * from mytable2 limit 4;
Query: select * from mytable2 limit 4
ERROR: File hdfs://namenode:8021/user/doug.meil/parquet/mytable/regid=2/myfile.prq has an incompatible type with the table schema for column long1.  Expected type: BYTE_ARRAY.  Actual type: INT64
ERROR: Invalid query handle

The original Avro schema defined 'long1' like this...

{"name": "long1", "type": "long"},

The "Actual type" of INT64 seems correct, because I meant to put a long in there. Why does Impala think the expected type is BYTE_ARRAY?

Note: summary queries (e.g., select count(*) from mytable2) actually WORK. Go figure.
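
One reading that fits the linked IMPALA-2835 (which concerns index-based versus name-based Parquet column access) is that table columns are matched to file columns by position rather than by name. The sketch below is a toy model of that idea, not Impala's code: the four-column file schema, both table definitions, and the function names are all hypothetical, and table column types are written directly as the Parquet physical type they would expect.

```python
# Toy model only: hypothetical schemas and helper names, not Impala internals.

# File columns in on-disk order: (name, Parquet physical type).
# An Avro "long" is stored as INT64; strings are stored as BYTE_ARRAY.
FILE_SCHEMA = [
    ("long1", "INT64"),
    ("long2", "INT64"),
    ("string1", "BYTE_ARRAY"),
    ("string2", "BYTE_ARRAY"),
]

# Hypothetical tables: the first names the leading file columns,
# the second names the trailing ones.
TABLE1 = [("long1", "INT64"), ("long2", "INT64")]
TABLE2 = [("string1", "BYTE_ARRAY"), ("string2", "BYTE_ARRAY")]


def resolve_by_position(table_cols, file_cols):
    """Match table column i to file column i, ignoring names."""
    errors = []
    for i, (_table_name, expected) in enumerate(table_cols):
        file_name, actual = file_cols[i]
        if expected != actual:
            errors.append(
                f"column {file_name}. Expected type: {expected}. "
                f"Actual type: {actual}"
            )
    return errors


def resolve_by_name(table_cols, file_cols):
    """Match each table column to the file column with the same name."""
    file_types = dict(file_cols)
    errors = []
    for table_name, expected in table_cols:
        actual = file_types.get(table_name)
        if actual is not None and actual != expected:
            errors.append(
                f"column {table_name}. Expected type: {expected}. "
                f"Actual type: {actual}"
            )
    return errors


if __name__ == "__main__":
    print(resolve_by_position(TABLE1, FILE_SCHEMA))
    print(resolve_by_position(TABLE2, FILE_SCHEMA))
    print(resolve_by_name(TABLE2, FILE_SCHEMA))
```

Under positional matching, the second table's first column is checked against the file's first column, so the mismatch is reported against long1 even though the table never mentions it. If that is what is happening, it may also explain why the count query succeeds: a row count would not need to read, and therefore type-check, any column values.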

Issue Links

is duplicated by: IMPALA-2835 Hive/Impala inconsistency with parquet.column.index.access=false (Resolved)

People

Assignee: Unassigned

Reporter: Doug Meil

Votes: 3

Watchers: 9

Dates

Created: 30/Jan/14 16:26

Updated: 11/Jun/18 16:55

Resolved: 11/Jun/18 16:55