Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 2.7.0
Description
This is blocking nightly performance runs.
On August 2nd, queries against tpch_nested_300_parquet started failing with:
Invalid file. This file: hdfs://vb0202.halxg.cloudera.com:8020/user/hive/warehouse/tpch_nested_300_parquet.db/customer_snappy/000371_0 has no row groups
Invalid file. This file: hdfs://vb0202.halxg.cloudera.com:8020/user/hive/warehouse/tpch_nested_300_parquet.db/customer_snappy/000466_0 has no row groups
These files appear to contain invalid data, given their size (828 bytes each):
-rw-r--r-- 3 mmokhtar hive 828 2016-02-23 12:48 /user/hive/warehouse/tpch_nested_300_parquet.db/customer_snappy/000371_0
-rw-r--r-- 3 mmokhtar hive 828 2016-02-23 12:49 /user/hive/warehouse/tpch_nested_300_parquet.db/customer_snappy/000466_0
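One way to tell a corrupt file from a well-formed but empty one is to look at the Parquet framing: a Parquet file starts and ends with the 4-byte magic `PAR1`, and the 4 bytes before the trailing magic hold the little-endian length of the footer metadata. The sketch below is not Impala code; `inspect_parquet_tail` is a hypothetical helper, and it assumes the file has already been copied locally (for example with `hdfs dfs -get`) and read into memory. It does not parse the Thrift footer, so it cannot count row groups directly. It only reports whether any bytes are left for column data once the magic markers and the footer are accounted for.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at both ends of a Parquet file


def inspect_parquet_tail(data: bytes) -> dict:
    """Report basic Parquet framing facts for a file's raw bytes.

    Hypothetical helper for triage only; it does not decode the footer.
    """
    result = {
        "size": len(data),
        "has_magic": False,
        "footer_len": None,
        "data_bytes": None,
    }
    # Smallest possible framing: leading magic + footer length + trailing magic.
    if len(data) < 12:
        return result
    if data[:4] != PARQUET_MAGIC or data[-4:] != PARQUET_MAGIC:
        return result
    result["has_magic"] = True

    # Footer length is a 4-byte little-endian unsigned integer.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    result["footer_len"] = footer_len

    # Bytes left over for row group data. Zero means the file holds only
    # metadata; a negative value means the footer length is inconsistent.
    result["data_bytes"] = len(data) - 12 - footer_len
    return result
```

If `has_magic` is true and `data_bytes` is 0, the file is structurally a Parquet file that carries only footer metadata, which would be consistent with a "no row groups" error rather than with corrupted contents.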
Queries against the same dataset used to succeed.
This is very likely a behavioral change introduced by http://github.mtv.cloudera.com/CDH/Impala/commit/40c01a7f92d2248229e8e45291a1ef43b8c40f48