Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version: Impala 2.12.0
Description
An empty Parquet file (one with no rows in it) causes the following warning in the explain output:
WARNING: The following tables have potentially corrupt table statistics. Drop and re-compute statistics to resolve this problem.
The warning is still shown after running
compute stats tp;
because the plan reports a non-empty file:
partitions=1/1 files=1 size=220B
but numRows = 0.
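The condition that appears to trigger the warning can be sketched as below. This is only an illustration of the symptom described above, assuming the planner treats "zero rows but non-zero file size" as suspicious; the class and method names are hypothetical and are not taken from the Impala code base.

```java
public class CorruptStatsCheck {
    // Hypothetical helper: returns true when the stats look inconsistent,
    // i.e. the partition reports zero rows although its files are not empty.
    // Assumption: this mirrors the symptom in this report, not Impala's actual code.
    static boolean looksCorrupt(long numRows, long sizeBytes) {
        return numRows == 0 && sizeBytes > 0;
    }

    public static void main(String[] args) {
        // The case from this report: numRows = 0, size = 220B.
        System.out.println(looksCorrupt(0L, 220L));
        // A table with no files and no rows is consistent.
        System.out.println(looksCorrupt(0L, 0L));
        // A non-empty table with a matching row count is consistent.
        System.out.println(looksCorrupt(10L, 220L));
    }
}
```

Under this assumption, an empty Parquet file always trips the check, because the file footer and schema alone take up space (220B here) even when the file holds no rows.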
A simple reproduction:
create table tp (a int) stored as parquet;
Create an empty CSV file (empty.csv).
Create a Parquet file from the CSV with a simple MR job, using the following schema:
"{\n" +
" \"type\": \"record\",\n" +
" \"name\": \"tp\",\n" +
" \"doc\": \"Avro schema for table tp\",\n" +
" \"fields\":\n" +
" [\n" +
" {\"name\": \"a\", \"type\": \"int\"}\n" +
" ]\n" +
"}\n");
Put the output Parquet file (attached) onto HDFS, then run:
compute stats tp;
explain select * from tp;