IMPALA-9759

Revisit integration of snapshot dataload with s3guard

    Description

      Sometimes the S3 jobs (which use s3guard for consistency) see test failures due to missing files from the dataload snapshot (see the failures at the bottom). This may be related to how snapshot loading interacts with s3guard. We should nail down exactly the right procedure for loading the snapshot. Currently, we do the following:
      1. Remove any data from the S3 bucket via the S3 command line
      2. Create the s3guard DynamoDB table (or reuse the existing one if a previous job failed without deleting it)
      3. Prune any existing entries from that table
      4. Load the snapshot to the S3 bucket

      In theory, this leaves s3guard with an empty DynamoDB table and an S3 bucket with data. As tests run and access the S3 bucket, s3guard should find no entry in the DynamoDB table and fall back to checking the underlying S3 bucket.
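
To make the expected fall-through explicit, here is a toy model of the lookup described above. It is only a sketch: `lookup`, `metadata_store`, and `bucket` are hypothetical names, not s3guard APIs, and the tombstone branch reflects an assumption about how a leftover delete marker would behave, which this ticket does not confirm.

```python
from typing import Dict, Optional, Set


def lookup(path: str, metadata_store: Dict[str, bool], bucket: Set[str]) -> Optional[str]:
    """Return where `path` was resolved from, or None if it is treated as missing.

    metadata_store maps path -> True (live entry) or False (tombstone, assumed);
    bucket is the set of object keys actually present in S3.
    """
    if path in metadata_store:
        # An entry exists, so the store's answer is used without consulting S3.
        return "metadata_store" if metadata_store[path] else None
    # No entry in the table: fall through to the underlying bucket.
    return "s3" if path in bucket else None


key = "test-warehouse/alltypes/year=2009/month=5/data.txt"  # hypothetical key
bucket = {key}

# Empty table after prune: the file is found via S3, as the theory expects.
print(lookup(key, {}, bucket))
# A stale tombstone left in the table would hide a file that exists in S3.
print(lookup(key, {key: False}, bucket))
```

If the table is truly empty after step 3, every lookup should take the second branch. The second call shows why a leftover entry would matter if that assumption about tombstones holds.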

      We need to revisit these steps and verify that everything is being done correctly.
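
As a starting point for that review, the four steps could be written down as an explicit command list. This is a sketch under assumptions: the ticket does not show the actual job scripts, so the bucket, table, and path arguments are placeholders, the `hadoop s3guard` flags are as I recall them from the Hadoop S3Guard documentation and should be checked against the Hadoop version in use, and loading with `aws s3 sync` is a guess at a load path that bypasses s3guard.

```python
from typing import List


def snapshot_load_commands(bucket: str, table: str, snapshot_dir: str,
                           prune_age_seconds: int = 1) -> List[List[str]]:
    """Build (but do not run) the assumed command sequence for a snapshot load."""
    s3a = f"s3a://{bucket}"
    meta = f"dynamodb://{table}"
    return [
        # 1. Remove any data from the S3 bucket via the S3 command line.
        ["aws", "s3", "rm", f"s3://{bucket}/", "--recursive"],
        # 2. Create the s3guard DynamoDB table (assumed to tolerate an existing table).
        ["hadoop", "s3guard", "init", "-meta", meta, s3a],
        # 3. Prune existing entries; the age threshold here is an assumption.
        ["hadoop", "s3guard", "prune", "-seconds", str(prune_age_seconds),
         "-meta", meta, s3a],
        # 4. Load the snapshot directly to S3 (assumed to bypass s3guard).
        ["aws", "s3", "sync", snapshot_dir, f"s3://{bucket}/test-warehouse"],
    ]


# Placeholder names, printed for review rather than executed.
for cmd in snapshot_load_commands("example-bucket", "example-table", "/tmp/snapshot"):
    print(" ".join(cmd))
```

Keeping the sequence as data makes it easy to log exactly what ran in a failing job and to compare it against the intended order.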

      metadata/test_metadata_query_statements.py:70: in test_show_stats
          self.run_test_case('QueryTest/show-stats', vector, "functional")
      common/impala_test_suite.py:687: in run_test_case
          self.__verify_results_and_errors(vector, test_section, result, use_db)
      common/impala_test_suite.py:523: in __verify_results_and_errors
          replace_filenames_with_placeholder)
      common/test_result_verifier.py:456: in verify_raw_results
          VERIFIER_MAP[verifier](expected, actual)
      common/test_result_verifier.py:278: in verify_query_result_is_equal
          assert expected_results == actual_results
      E assert Comparing QueryTestResults (expected vs actual):
      E '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1' == '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1'
      E '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10' == '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10'
      E '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11' == '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11'
      E '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12' == '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12'
      E '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2' == '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2'
      E '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3' == '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3'
      E '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4' == '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4'
      E '2009','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5' != '2009','5',0,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5'
      E '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6' == '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6'
      E '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7' == '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7'
      E '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8' == '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8'
      E '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9' == '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9'
      E '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1' == '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1'
      E '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10' == '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10'
      E '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11' == '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11'
      E '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12' == '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12'
      E '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2' == '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2'
      E '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3' == '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3'
      E '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4' == '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4'
      E '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5' == '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5'
      E '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6' == '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6'
      E '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7' == '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7'
      E '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8' == '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8'
      E '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9' == '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9'
      E 'Total','',7300,24,'478.45KB','0B','','','','' != 'Total','',6990,24,'478.45KB','0B','','','',''
      

      This also shows up in cardinality calculations:

      metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
          check_cardinality(result.data, '7.30K')
      metadata/test_explain.py:98: in check_cardinality
          query_result, expected_cardinality=expected_cardinality)
      metadata/test_explain.py:86: in check_row_size_and_cardinality
          assert m.groups()[1] == expected_cardinality
      E assert '6.99K' == '7.30K'
      E - 6.99K
      E + 7.30K
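
The two failures line up numerically: the only mismatched partition in the show-stats output is year=2009/month=5 (310 rows expected, 0 reported), and that accounts for the whole gap in both the total and the cardinality estimate. A quick check, where `format_cardinality` is a hypothetical helper that only mimics the "7.30K" style seen above and is not Impala's formatting code:

```python
expected_total = 7300          # 'Total' row, expected
actual_total = 6990            # 'Total' row, actual
missing_partition_rows = 310   # expected #Rows for year=2009/month=5


def format_cardinality(rows: int) -> str:
    """Render a row count in thousands with two decimals, e.g. 7300 -> '7.30K'."""
    return f"{rows / 1000:.2f}K"


# The single affected partition explains the full difference in the totals.
assert expected_total - actual_total == missing_partition_rows

print(format_cardinality(expected_total))  # 7.30K
print(format_cardinality(actual_total))    # 6.99K
```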
      

            People

              Assignee: Unassigned
              Reporter: Joe McDonnell (joemcdonnell)