
test_mt_dop.py fails on local filesystem build

    Details

      Description

      From Jenkins:

      03:55:33 [gw2] FAILED query_test/test_mt_dop.py::TestMtDopParquetFiltering::test_parquet_filtering[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: parquet/none] 
      ...
      03:55:33 =================================== FAILURES ===================================
      03:55:33  TestMtDopParquetFiltering.test_parquet_filtering[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: parquet/none] 
      03:55:33 [gw2] linux2 -- Python 2.6.6 /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python
      03:55:33 query_test/test_mt_dop.py:115: in test_parquet_filtering
      03:55:33     self.run_test_case('QueryTest/mt-dop-parquet-filtering', vector)
      03:55:33 common/impala_test_suite.py:399: in run_test_case
      03:55:33     verify_runtime_profile(test_section['RUNTIME_PROFILE'], result.runtime_profile)
      03:55:33 common/test_result_verifier.py:490: in verify_runtime_profile
      03:55:33     actual))
      03:55:33 E   AssertionError: Did not find matches for lines in runtime profile:
      03:55:33 E   EXPECTED LINES:
      03:55:33 E   row_regex: .*NumRowGroups: 3.*
      ...
      
      1. JENKINS_CONSOLE.txt
        53 kB
        Alexander Behm

        Activity

        alex.behm Alexander Behm added a comment -

        Joe McDonnell, was this caused by your Parquet dictionary filtering patch?

        joemcdonnell Joe McDonnell added a comment -

        commit 6441ca65bda83c23dacfed8a27d944a0dabe6b65
        Author: Joe McDonnell <joemcdonnell@cloudera.com>
        Date: Tue Mar 7 12:20:09 2017 -0800

        IMPALA-5039: Fix variability in parquet dictionary filtering test

        The tests for dictionary filtering look at how many row groups are
        processed and how many are filtered by matching text in the profile.
        However, the number of row groups processed and filtered by any
        individual fragment depends on how the work is split and how many
        impalads are running. This causes variability in the test output.

        To fix this, the test needs a way to aggregate the results across
        fragments. This fix introduces the following syntax for specifying
        these aggregates:
        aggregate(function_name, field_name): expected_value
        This searches the runtime profile for lines that contain
        'field_name: number'. It skips the averaged fragment, as this is
        derived from all the other fragments.

        Currently, only SUM is implemented, and the expected_value is
        required to be an integer. It should be easy to implement other
        interesting functions like COUNT and MIN/MAX. It would also be
        possible to extend it to floats.

        Switching the dictionary filtering tests over to this new syntax
        eliminates the variability in the tests.

        Change-Id: I6b7b84d973b3ac678a24e82900f2637d569158bb
        Reviewed-on: http://gerrit.cloudera.org:8080/6301
        Tested-by: Impala Public Jenkins
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
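
The aggregate check described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function names `sum_profile_counter` and `verify_aggregate`, the sample profile text, and the assumption that an averaged section begins with a line starting with "Averaged Fragment" and ends at the next line starting with "Fragment" are all hypothetical.

```python
import re

# Hypothetical, heavily simplified profile text; a real Impala runtime
# profile contains far more detail than this.
SAMPLE_PROFILE = """\
Averaged Fragment F00:
  HDFS_SCAN_NODE (id=0):
    - NumRowGroups: 1
Fragment F00:
  Instance a (host=h1):
    HDFS_SCAN_NODE (id=0):
      - NumRowGroups: 2
  Instance b (host=h2):
    HDFS_SCAN_NODE (id=0):
      - NumRowGroups: 1
"""


def sum_profile_counter(profile_text, field_name):
    """Sum every 'field_name: <integer>' found outside averaged fragments."""
    pattern = re.compile(r"\b%s: (\d+)" % re.escape(field_name))
    total = 0
    in_averaged = False
    for line in profile_text.splitlines():
        stripped = line.strip()
        # Assumption: section headers look like the ones in SAMPLE_PROFILE.
        if stripped.startswith("Averaged Fragment"):
            in_averaged = True
        elif stripped.startswith("Fragment"):
            in_averaged = False
        if in_averaged:
            # The averaged fragment is derived from the others, so skip it.
            continue
        match = pattern.search(line)
        if match:
            total += int(match.group(1))
    return total


def verify_aggregate(expectation, profile_text):
    """Check one 'aggregate(function_name, field_name): expected_value' line."""
    parsed = re.match(r"aggregate\((\w+),\s*(\w+)\):\s*(\d+)$",
                      expectation.strip())
    if parsed is None:
        raise ValueError("not an aggregate expectation: %r" % expectation)
    function_name, field_name, expected = parsed.groups()
    if function_name.upper() != "SUM":
        # Only SUM is sketched here, mirroring the commit message.
        raise ValueError("unsupported aggregate function: %s" % function_name)
    return sum_profile_counter(profile_text, field_name) == int(expected)
```

With the sample profile, the expectation `aggregate(SUM, NumRowGroups): 3` holds however the three row groups are split across instances, which is the property the per-fragment `row_regex` match lacked.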


          People

          • Assignee:
            joemcdonnell Joe McDonnell
            Reporter:
            alex.behm Alexander Behm