Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
1.7.6
None
None
Description
When running queries on truncated files, Impala's Avro scanner issues a warning:
WARNINGS: Problem parsing file hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro at 1327214080(EOF) Tried to read 64653 bytes but could only read 16549 bytes. This may indicate data file corruption. (file hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro, byte offset: 1327214080)
avro-tools tojson eventually prints the same number of rows that Impala reads, but does not print a warning. Instead it seems to quietly swallow the EOFException.
I think avro-tools should report the truncation with a warning rather than silently ignoring it.
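To illustrate the requested behaviour, here is a minimal sketch of a reader that reports a short read instead of silently stopping. It is not avro-tools code and does not parse the Avro container format: the framing (a 4-byte big-endian length followed by the payload) and the function name read_blocks are hypothetical, chosen only to show a warning that carries the byte offset and the requested versus available byte counts, similar to the Impala message above.

```python
import io
import struct
import warnings


def read_blocks(stream):
    """Read length-prefixed blocks and warn on truncation.

    Hypothetical framing: 4-byte big-endian length, then the payload.
    This is NOT the Avro object container format.
    """
    blocks = []
    while True:
        offset = stream.tell()
        header = stream.read(4)
        if not header:
            break  # clean EOF exactly on a block boundary: nothing to report
        if len(header) < 4:
            warnings.warn(
                f"truncated block header at byte offset {offset}: "
                f"tried to read 4 bytes but could only read {len(header)} bytes"
            )
            break
        (size,) = struct.unpack(">I", header)
        offset = stream.tell()
        payload = stream.read(size)
        if len(payload) < size:
            # Surface the short read instead of swallowing it; the blocks
            # read so far are still returned to the caller.
            warnings.warn(
                f"truncated block at byte offset {offset}: "
                f"tried to read {size} bytes but could only read {len(payload)} bytes"
            )
            break
        blocks.append(payload)
    return blocks


if __name__ == "__main__":
    # One complete block followed by a block cut off after 2 of 5 bytes.
    data = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 5) + b"de"
    print(read_blocks(io.BytesIO(data)))
```

With this approach the caller still gets every complete block, which matches the observation that both tools return the same number of rows, but the truncation is no longer invisible.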