IMPALA-9094

Update test_hms_integration.py test_compute_stats_get_to_hive to account for separate Hive/Impala statistics

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: Impala 3.4.0
    • Component/s: Frontend
    • Labels: None

    Description

      With newer Hive versions, Impala and Hive statistics are kept separately and no longer overwrite each other. In test_hms_integration.py, test_compute_stats_get_to_hive expects Hive's stats to change when Impala runs compute stats, and test_compute_stats_get_to_impala expects Impala's stats to change when Hive runs compute stats. These tests need to be revised. Here is an example test failure:

      metadata/test_hms_integration.py:486: in test_compute_stats_get_to_hive
          assert hive_stats != self.hive_column_stats(table_name, 'x')
      E   assert {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'} != {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'}
      E    +  where {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'} = <bound method TestHmsIntegration.hive_column_stats of <test_hms_integration.TestHmsIntegration object at 0xe260e50>>('zbberubbydyldirc.fkqzvzekyqsjnflk', 'x')
      E    +    where <bound method TestHmsIntegration.hive_column_stats of <test_hms_integration.TestHmsIntegration object at 0xe260e50>> = <test_hms_integration.TestHmsIntegration object at 0xe260e50>.hive_column_stats

      If my theory is right, we should flip the tests so that they assert that Impala's compute stats does not affect Hive's stats, and vice versa.
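
The flipped check might look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeMetastore` and `check_stats_are_isolated` are hypothetical names, and the in-memory class stands in for a metastore that stores each engine's statistics separately. It does not use the real test-suite helpers or any Hive or Impala API.

```python
# Hypothetical in-memory stand-in for a metastore that keeps Hive and Impala
# column statistics in separate slots (not the real HMS or test-suite API).
class FakeMetastore:
    def __init__(self):
        self._stats = {"hive": {}, "impala": {}}

    def compute_stats(self, engine, table, column, stats):
        # Each engine writes only to its own slot.
        self._stats[engine][(table, column)] = dict(stats)

    def column_stats(self, engine, table, column):
        # Return a copy so callers cannot mutate the stored state.
        return dict(self._stats[engine].get((table, column), {}))


def check_stats_are_isolated(hms, table, column):
    hive_before = hms.column_stats("hive", table, column)
    hms.compute_stats("impala", table, column, {"distinct_count": 3})
    # Flipped assertion: Impala's compute stats must NOT change Hive's stats.
    assert hive_before == hms.column_stats("hive", table, column)

    impala_before = hms.column_stats("impala", table, column)
    hms.compute_stats("hive", table, column, {"distinct_count": 5})
    # And vice versa: Hive's compute stats must NOT change Impala's stats.
    assert impala_before == hms.column_stats("impala", table, column)
    return True
```

The original assertions used `!=` to require a change; the sketch uses `==` on a before/after snapshot to require that the other engine's stats stay untouched.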

            People

              Assignee: Attila Jeges (attilaj)
              Reporter: Joe McDonnell (joemcdonnell)