Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Impala 3.4.0
Description
When working on the Impala 3.4 release, we changed the version on branch-3.4.0 from 3.4.0-SNAPSHOT to 3.4.0-RELEASE.
metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation() now fails with the following error:
metadata/test_stats_extrapolation.py:44: in test_stats_extrapolation
    self.run_test_case('QueryTest/stats-extrapolation', vector, unique_database)
common/impala_test_suite.py:690: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:246: in verify_query_result_is_subset
    assert expected_literal_strings <= actual_literal_strings
E   assert Items in expected results not found in actual results:
E     ' tuple-ids=0 row-size=4B cardinality=17.91K'
E   Items in actual results:
E     '| output exprs: id'
E     ''
E     ' table: rows=unavailable size=unavailable'
E     ' stored statistics:'
E     'Max Per-Host Resource Reservation: Memory=8.00KB Threads=2'
E     ' columns: unavailable'
E     ' partitions: 0/24 rows=unavailable'
E     '00:SCAN HDFS [test_stats_extrapolation_5c6bdfd.alltypes]'
E     ' tuple-ids=0 row-size=4B cardinality=17.90K'
E     '|'
E     'Analyzed query: SELECT id FROM test_stats_extrapolation_5c6bdfd.alltypes'
E     'F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1'
E     ' HDFS partitions=24/24 files=36 size=281.43KB'
E     'test_stats_extrapolation_5c6bdfd.alltypes'
E     'PLAN-ROOT SINK'
E     '| mem-estimate=0B mem-reservation=0B thread-reservation=0'
E     '| Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=2'
E     ' in pipelines: 00(GETNEXT)'
E     ' extrapolated-rows=unavailable max-scan-range-rows=unavailable'
E     'Per-Host Resource Estimates: Memory=16MB'
E     'WARNING: The following tables are missing relevant table and/or column statistics.'
E     ' mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1'
The test expects a cardinality of 17.91K in the plan output, but the actual cardinality is 17.90K.
The string RELEASE is one character shorter than SNAPSHOT. The version gets embedded in Parquet files, so the Parquet files are slightly smaller than before. The test estimates cardinality from the size of the Parquet files, and the expected value was apparently right on a rounding boundary, so this small size change is enough to move the reported cardinality from 17.91K to 17.90K.
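The rounding effect can be illustrated with a minimal sketch. It assumes the estimate is proportional to total file size and is displayed with two decimals in units of K, as in the plan output above. The function names, the rows-per-byte ratio, and the byte counts are hypothetical and are not taken from Impala or from the actual test data.

```python
def extrapolate_rows(total_bytes: int, sampled_rows: int, sampled_bytes: int) -> int:
    """Scale a known row count by the ratio of total bytes to sampled bytes."""
    return round(total_bytes * sampled_rows / sampled_bytes)


def format_cardinality(rows: int) -> str:
    """Format a row count with two decimals in units of K (e.g. 17.90K)."""
    if rows >= 1000:
        return f"{rows / 1000:.2f}K"
    return str(rows)


# Hypothetical numbers: 1000 rows were observed in 16000 bytes.
SAMPLED_ROWS = 1000
SAMPLED_BYTES = 16000

# Hypothetical total sizes that differ by only 36 bytes.
size_snapshot = 286_496
size_release = 286_460

before = format_cardinality(extrapolate_rows(size_snapshot, SAMPLED_ROWS, SAMPLED_BYTES))
after = format_cardinality(extrapolate_rows(size_release, SAMPLED_ROWS, SAMPLED_BYTES))
print(before, after)
```

With these made-up inputs, a 36-byte difference moves the estimate across the 17,905-row rounding boundary, which is the kind of change seen in the failure.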
This test should tolerate this difference.
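One possible way to tolerate such a difference is to compare the cardinality numerically with a relative tolerance instead of matching the literal string. The sketch below is only an illustration: `parse_cardinality`, `cardinality_matches`, and the 1% default tolerance are hypothetical, and none of this is part of Impala's test framework or necessarily how the issue was fixed.

```python
import math
import re

# Matches e.g. "cardinality=17.91K" or "cardinality=42" in a plan line.
_CARDINALITY_RE = re.compile(r"cardinality=([0-9.]+)([KMB]?)")
_SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "B": 1e9}


def parse_cardinality(line: str) -> float:
    """Extract the cardinality from a plan line as a plain number."""
    match = _CARDINALITY_RE.search(line)
    if match is None:
        raise ValueError(f"no cardinality found in: {line!r}")
    return float(match.group(1)) * _SCALE[match.group(2)]


def cardinality_matches(expected_line: str, actual_line: str, rel_tol: float = 0.01) -> bool:
    """Return True if the two cardinalities agree within rel_tol (hypothetical default: 1%)."""
    expected = parse_cardinality(expected_line)
    actual = parse_cardinality(actual_line)
    return math.isclose(expected, actual, rel_tol=rel_tol)


expected_line = " tuple-ids=0 row-size=4B cardinality=17.91K"
actual_line = " tuple-ids=0 row-size=4B cardinality=17.90K"
print(cardinality_matches(expected_line, actual_line))
```

A comparison like this would accept 17.90K against an expected 17.91K while still rejecting a clearly different estimate such as 20.00K.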