[IMPALA-11114] calculate_tval fails with ZeroDivisionError if the standard deviations are 0 - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Impala 4.1.0
Component/s: None
Labels: None

Description

Possible cause:

Rounding or other truncation of the data can produce a sample standard deviation of zero even though the underlying measurements do vary. If the difference being measured is smaller than the measurement error, that is a separate problem which the t-test does not address.

https://stats.stackexchange.com/questions/78570/t-test-with-sample-standard-deviation-of-zero-possible/275879
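
To see why zero standard deviations trigger the failure, assume that `sem` in `calculate_tval` is the usual unpooled standard error of the difference of two means. This is an assumption: the log only shows the final division, and the symbols below ($\bar{x}$, $s$, $n$ and their `ref` counterparts) are named here for illustration, matching the AVG, STDDEV and ITERATIONS values passed in.

```latex
\[
  \mathrm{sem} = \sqrt{\frac{s^{2}}{n} + \frac{s_{\mathrm{ref}}^{2}}{n_{\mathrm{ref}}}},
  \qquad
  t = \frac{\bar{x} - \bar{x}_{\mathrm{ref}}}{\mathrm{sem}}
\]
% If s = 0 and s_ref = 0, then sem = 0 and t is undefined,
% which Python reports as "float division by zero".
```

Under this assumption, a single zero standard deviation is harmless; the division only fails when both samples report zero spread.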

Full log:

Traceback (most recent call last):
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 1131, in <module>
    report = Report(grouped, ref_grouped)
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 494, in __init__
    self.__analyze()
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 514, in __analyze
    query_comparison_row = Report.QueryComparisonRow(results, ref_results)
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 370, in __init__
    self.__check_perf_change_significance(results, ref_results))
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 390, in __check_perf_change_significance
    ref_stat[AVG], ref_stat[STDDEV], ref_stat[ITERATIONS])
  File "/home/gfurnstahl/Impala/tests/util/calculation_util.py", line 65, in calculate_tval
    return (avg - ref_avg) / sem
ZeroDivisionError: float division by zero
Traceback (most recent call last):
  File "bin/single_node_perf_run.py", line 359, in <module>
    main()
  File "bin/single_node_perf_run.py", line 349, in main
    perf_ab_test(options, args)
  File "bin/single_node_perf_run.py", line 267, in perf_ab_test
    compare(temp_dir, hash_a, hash_b)
  File "bin/single_node_perf_run.py", line 175, in compare
    report_benchmark_results(file_a, file_b, description)
  File "bin/single_node_perf_run.py", line 166, in report_benchmark_results
    stdout=f)
  File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py', '--reference_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de.json', '--input_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/b4d04112559c3f04ebf42b36deb1cd537dea78c4.json', '--report_description="a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de vs b4d04112559c3f04ebf42b36deb1cd537dea78c4"']' returned non-zero exit status 1
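
One way to avoid the crash is to guard the division. The sketch below is illustrative only and is not necessarily the fix merged for this issue: the function name `calculate_tval_guarded`, the parameter names, the assumed `sem` formula, and the convention chosen for the zero case (0.0 when the averages match, signed infinity otherwise) are all assumptions.

```python
import math


def calculate_tval_guarded(avg, stddev, iters, ref_avg, ref_stddev, ref_iters):
    """Hypothetical zero-safe variant of calculate_tval (not the merged fix)."""
    # Assumed formula: unpooled standard error of the difference of the means.
    # float() keeps the division exact under Python 2 as well.
    sem = math.sqrt(stddev ** 2 / float(iters) + ref_stddev ** 2 / float(ref_iters))
    diff = avg - ref_avg
    if sem == 0:
        # Both samples report zero spread, so the t value is undefined.
        # Assumed convention: 0.0 for equal averages, signed infinity otherwise.
        if diff == 0:
            return 0.0
        return math.copysign(float("inf"), diff)
    return diff / sem
```

Whether an infinite t value, a zero, or skipping the significance check altogether is the right behaviour depends on how the report consumes the result; given the rounding concern in the description, treating such rows as "not enough information" may be the safer reading.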

People

Assignee: Gergely Fürnstáhl

Reporter: Gergely Fürnstáhl

Votes: 0

Watchers: 4

Dates

Created: 09/Feb/22 10:56

Updated: 03/Aug/22 09:42

Resolved: 03/Aug/22 09:42