HIVE-5921 (sub-task of HIVE-5369 "Annotate hive operator tree with statistics from metastore")

Better heuristics for worst-case statistics estimates for join, limit and filter operators

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor, Statistics
    • Labels:
      None

      Description

      This is a subtask of HIVE-5369. In the worst case (i.e., in the absence of column statistics), HIVE-5849 improved the basic statistics with heuristics. But the heuristics failed to provide good estimates in a few cases. For example, the FILTER operator heuristics did not take into account the number of predicates or whether a predicate contains a partition column. Also, the JOIN estimates were too aggressive and were not user configurable.
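
To illustrate why the number of predicates matters for a FILTER estimate, here is a minimal sketch. It assumes, purely for illustration, that every AND-ed predicate without column statistics keeps half of its input rows. The class name, the method name and the 0.5 selectivity are hypothetical and are not taken from the patch; predicates on partition columns are not modeled.

```java
import java.util.List;

// Hypothetical sketch, not Hive's actual rule: with no column statistics,
// assume every AND-ed predicate keeps half of its input rows.
final class FilterEstimateSketch {

    // Assumed selectivity of one predicate when nothing is known about it.
    static final double DEFAULT_SELECTIVITY = 0.5;

    // Estimate the output row count of a filter whose condition is the
    // conjunction of the given predicates.
    static long estimateRows(long inputRows, List<String> conjuncts) {
        if (inputRows == 0) {
            return 0;
        }
        double rows = inputRows;
        for (int i = 0; i < conjuncts.size(); i++) {
            rows *= DEFAULT_SELECTIVITY; // each extra predicate shrinks the estimate
        }
        // Keep at least one row so downstream estimates do not collapse to zero.
        return Math.max(1L, (long) rows);
    }

    public static void main(String[] args) {
        System.out.println(estimateRows(1000, List.of("a > 10")));
        System.out.println(estimateRows(1000, List.of("a > 10", "b = 'x'")));
    }
}
```

A heuristic that ignores the predicate count would return the same estimate for both calls in `main`, which is the kind of shortcoming the description refers to.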

      1. HIVE-5921.1.patch
        532 kB
        Prasanth J
      2. HIVE-5921.2.patch
        532 kB
        Prasanth J
      3. HIVE-5921.3.patch
        545 kB
        Prasanth J
      4. HIVE-5921.4.patch
        544 kB
        Prasanth J

          Activity

          Prasanth J added a comment -

          The FILTER rule now evaluates each predicate expression. The JOIN rule now takes hints from the user in the form of a Hive config setting. In the absence of basic statistics (row count and data size), the estimated row count and data size are computed from the average row size, which is derived from the schema. Regenerated all affected tests.
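
The schema-based fallback and the user-supplied JOIN hint described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the per-type byte sizes, the `joinFactor` parameter and all class and method names are hypothetical, and scaling the larger input by the factor is an assumed formula.

```java
import java.util.List;

// Hypothetical sketch of worst-case estimates when basic statistics are missing.
final class SchemaStatsSketch {

    // Assumed size in bytes for a column type (illustrative values only).
    static int assumedSize(String type) {
        switch (type) {
            case "boolean":
            case "tinyint":
                return 1;
            case "smallint":
                return 2;
            case "int":
            case "float":
                return 4;
            case "bigint":
            case "double":
                return 8;
            default:
                return 100; // arbitrary guess for variable-length types such as string
        }
    }

    // Average row size derived from the schema alone.
    static long avgRowSize(List<String> columnTypes) {
        long size = 0;
        for (String type : columnTypes) {
            size += assumedSize(type);
        }
        return size;
    }

    // Row count estimated as data size divided by the schema-derived row size.
    static long estimateRowCount(long dataSizeBytes, List<String> columnTypes) {
        long avg = avgRowSize(columnTypes);
        return avg == 0 ? 0 : dataSizeBytes / avg;
    }

    // JOIN output estimated by scaling the larger input by a user-supplied
    // factor (an assumed formula; joinFactor stands in for the config hint).
    static long estimateJoinRows(long leftRows, long rightRows, double joinFactor) {
        return (long) (Math.max(leftRows, rightRows) * joinFactor);
    }

    public static void main(String[] args) {
        List<String> schema = List.of("int", "bigint", "string");
        System.out.println(avgRowSize(schema));
        System.out.println(estimateRowCount(11200, schema));
        System.out.println(estimateJoinRows(1000, 200, 1.1));
    }
}
```

Exposing the factor as a parameter mirrors the idea of letting the user tune how aggressive the JOIN estimate is instead of hard-coding it.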

          Prasanth J added a comment -

          Marking this as Patch Available for precommit tests.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12616842/HIVE-5921.1.patch

          ERROR: -1 due to 14 failed/errored test(s), 4449 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap3
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_predicate_pushdown
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_between
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_string_funcs
          org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range
          org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown
          org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_ppd_key_ranges
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/499/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/499/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 14 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12616842

          Prasanth J added a comment -

          Fixed failing tests.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12616903/HIVE-5921.2.patch

          ERROR: -1 due to 2 failed/errored test(s), 4453 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/506/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/506/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 2 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12616903

          Prasanth J added a comment -

          Addressed Harish Butani's review comments.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12617062/HIVE-5921.3.patch

          ERROR: -1 due to 8 failed/errored test(s), 4457 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_map_ppr
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
          

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/518/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/518/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 8 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12617062

          Prasanth J added a comment -

          Fixed failing tests. The decimal_udf.q failure is addressed in HIVE-5947.
          TestMinimrCliDriver.testCliDriver_bucket_num_reducers seems to be unrelated to this JIRA.

          Harish Butani added a comment -

          +1
          subject to all tests passing

          Prasanth J added a comment -

          Reuploading patch for precommit tests.

          Hive QA added a comment -

          Overall: +1 all checks pass

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12617115/HIVE-5921.4.patch

          SUCCESS: +1 4457 tests passed

          Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/528/testReport
          Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/528/console

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          

          This message is automatically generated.

          ATTACHMENT ID: 12617115

          Ashutosh Chauhan added a comment -

          Committed to trunk. Thanks, Prasanth!


            People

            • Assignee:
              Prasanth J
              Reporter:
              Prasanth J
            • Votes:
              0
              Watchers:
              2
