data_errors/test_data_errors.py::TestHdfsScanNodeErrors failing on asf-master-exhaustive

      Description

      Lots of issues related to column types that appear to have changed.

      Suspect https://gerrit.cloudera.org/#/c/6526/

      Hopefully this is just a simple test fix.

        TestHdfsScanNodeErrors.test_hdfs_scan_node_errors[exec_option: {'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 1, 'num_nodes': 0} | table_format: text/gzip/block] 
      10:10:44 [gw2] linux2 -- Python 2.6.6 /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/bin/../infra/python/env/bin/python
      10:10:44 data_errors/test_data_errors.py:83: in test_hdfs_scan_node_errors
      10:10:44     self.run_test_case('DataErrorsTest/hdfs-scan-node-errors', vector)
      10:10:44 common/impala_test_suite.py:400: in run_test_case
      10:10:44     self.__verify_results_and_errors(vector, test_section, result, use_db)
      10:10:44 common/impala_test_suite.py:273: in __verify_results_and_errors
      10:10:44     replace_filenames_with_placeholder)
      10:10:44 common/test_result_verifier.py:317: in verify_raw_results
      10:10:44     verify_errors(expected_errors, actual_errors)
      10:10:44 common/test_result_verifier.py:274: in verify_errors
      10:10:44     VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
      10:10:44 common/test_result_verifier.py:231: in verify_query_result_is_equal
      10:10:44     assert expected_results == actual_results
      10:10:44 E   assert Comparing QueryTestResults (expected vs actual):
      10:10:44 E     'Error converting column: 1 to BOOLEAN' == 'Error converting column: 1 to BOOLEAN'
      10:10:44 E     'Error converting column: 1 to BOOLEAN' == 'Error converting column: 1 to BOOLEAN'
      10:10:44 E     'Error converting column: 10 to TIMESTAMP' != 'Error converting column: 2 to TINYINT'
      10:10:44 E     'Error converting column: 10 to TIMESTAMP' != 'Error converting column: 2 to TINYINT'
      10:10:44 E     'Error converting column: 10 to TIMESTAMP' != 'Error converting column: 2 to TINYINT'
      10:10:44 E     'Error converting column: 10 to TIMESTAMP' != 'Error converting column: 2 to TINYINT'
      10:10:44 E     'Error converting column: 2 to TINYINT' != 'Error converting column: 3 to SMALLINT'
      10:10:44 E     'Error converting column: 2 to TINYINT' != 'Error converting column: 3 to SMALLINT'
      10:10:44 E     'Error converting column: 2 to TINYINT' != 'Error converting column: 3 to SMALLINT'
      10:10:44 E     'Error converting column: 2 to TINYINT' != 'Error converting column: 4 to INT'
      10:10:44 E     'Error converting column: 3 to SMALLINT' != 'Error converting column: 4 to INT'
      10:10:44 E     'Error converting column: 3 to SMALLINT' != 'Error converting column: 4 to INT'
      10:10:44 E     'Error converting column: 3 to SMALLINT' != 'Error converting column: 4 to INT'
      10:10:44 E     'Error converting column: 4 to INT' != 'Error converting column: 5 to BIGINT'
      10:10:44 E     'Error converting column: 4 to INT' != 'Error converting column: 5 to BIGINT'
      10:10:44 E     'Error converting column: 4 to INT' != 'Error converting column: 6 to FLOAT'
      10:10:44 E     'Error converting column: 4 to INT' != 'Error converting column: 6 to FLOAT'
      10:10:44 E     'Error converting column: 5 to BIGINT' != 'Error converting column: 6 to FLOAT'
      10:10:44 E     'Error converting column: 5 to BIGINT' != 'Error converting column: 7 to DOUBLE'
      10:10:44 E     'Error converting column: 6 to FLOAT' != 'Error converting column: 7 to DOUBLE'
      10:10:44 E     'Error converting column: 6 to FLOAT' != 'Error converting column: 7 to DOUBLE'
      10:10:44 E     'Error converting column: 6 to FLOAT' != 'Error converting column: 7 to DOUBLE'
      10:10:44 E     'Error converting column: 7 to DOUBLE' != 'Error parsing row: file: hdfs://localhost:20500/test-warehouse/alltypeserrornonulls_text_gzip/year=2009/month=1/000001_0.gz, before offset: 228'
      10:10:44 E     'Error converting column: 7 to DOUBLE' != 'Error parsing row: file: hdfs://localhost:20500/test-warehouse/alltypeserrornonulls_text_gzip/year=2009/month=1/000001_0.gz, before offset: 228'
      10:10:44 E     'Error converting column: 7 to DOUBLE' != 'Error parsing row: file: hdfs://localhost:20500/test-warehouse/alltypeserrornonulls_text_gzip/year=2009/month=1/000001_0.gz, before offset: 228'
      10:10:44 E     'Error converting column: 7 to DOUBLE' != 'Error parsing row: file: hdfs://localhost:20500/test-warehouse/alltypeserrornonulls_text_gzip/year=2009/month=1/000001_0.gz, before offset: 228'
      

        Activity

        zamsden Zach Amsden added a comment -

        This is probably simply a matter of updating the test data:

        testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test

        zamsden Zach Amsden added a comment -

        Looks like the unrelated conversion failures are cascading failures: the lexically sorted results are thrown off by the timestamp change.
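
To illustrate the cascading effect described above, here is a minimal sketch. It assumes the verifier sorts both result lists lexically and compares them index by index, which is what the log output suggests. The helper name `pairwise_mismatches` and the shortened error lists are hypothetical; this is not the code in common/test_result_verifier.py.

```python
# Hypothetical illustration only; not Impala's actual verifier code.
def pairwise_mismatches(expected, actual):
    """Sort both lists lexically and return the index-aligned pairs that differ."""
    return [(e, a) for e, a in zip(sorted(expected), sorted(actual)) if e != a]


# Shortened, made-up lists in the style of the log above.
expected = [
    "Error converting column: 1 to BOOLEAN",
    "Error converting column: 10 to TIMESTAMP",
    "Error converting column: 2 to TINYINT",
    "Error converting column: 3 to SMALLINT",
]
# The column-10 error is gone and one other error appears in its place.
actual = [
    "Error converting column: 1 to BOOLEAN",
    "Error converting column: 2 to TINYINT",
    "Error converting column: 3 to SMALLINT",
    "Error converting column: 4 to INT",
]

mismatches = pairwise_mismatches(expected, actual)
for e, a in mismatches:
    print("%r != %r" % (e, a))
```

Because "10 to TIMESTAMP" sorts between "1 to BOOLEAN" and "2 to TINYINT", dropping that single entry shifts every later entry up by one position, so the rows after it are reported as mismatches even though only one error message really changed.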

        zamsden Zach Amsden added a comment -

        The problem appears pretty simple to fix; I'll take care of it.

        Caveat: I can't build impala right now to test it.

        mjacobs Matthew Jacobs added a comment -

        I think the issue is actually that alltypeserror was modified by https://gerrit.cloudera.org/#/c/6526/11/testdata/datasets/functional/functional_schema_template.sql

        I'll post a fix

        zamsden Zach Amsden added a comment -

        I miscounted columns and thought the suspect change had made column 10 a timestamp column, in which case the conversion from string would no longer happen and the errors would disappear. That wasn't the case; there was an actual problem somewhere.

        mjacobs Matthew Jacobs added a comment -

        commit 7c368999f8d991b51d440e42a7a4aafab034fe39
        Author: Matthew Jacobs <mj@cloudera.com>
        Date: Mon May 15 14:45:02 2017 -0700

        IMPALA-5319: Fix test_hdfs_scan_node_errors failures

        The recent Kudu TIMESTAMP patch (IMPALA-5137) made an
        inadvertent change [1] to alltypeserror_tmp and
        alltypeserrornonulls_tmp, changing 'timestamp_col' from
        STRING to TIMESTAMP.

        This seems to cause failures on exhaustive jobs which run
        test_hdfs_scan_node_errors against all file-formats.
        I haven't been able to reproduce this failure myself, so
        cannot test whether this fixes the jobs that are failing, but
        this change to revert these tables seems warranted given
        they were changed inadvertently.

        1: https://gerrit.cloudera.org/#/c/6526/11/testdata/datasets/functional/functional_schema_template.sql

        Change-Id: I533f1921662802ea6e076eefac973f50c014fcb5
        Reviewed-on: http://gerrit.cloudera.org:8080/6891


          People

          • Assignee:
            mjacobs Matthew Jacobs
            Reporter:
            zamsden Zach Amsden