Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-10598

test_cache_reload_validation is flaky

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Test
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • Impala 4.0.0
    • None
    • ghx-label-4

    Description

      I noticed that when I run

      bin/impala-py.test tests/query_test/test_hdfs_caching.py -k test_cache_reload_validation
      

      I see a the following failure on master branch.

       TestHdfsCachingDdl.test_cache_reload_validation[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none] 
      tests/query_test/test_hdfs_caching.py:269: in test_cache_reload_validation
          assert num_entries_pre + 4 == get_num_cache_requests(), \
      E   AssertionError: Adding the tables should be reflected by the number of cache directives.
      E   assert (2 + 4) == 7
      E    +  where 7 = get_num_cache_requests()
      
      

      This failure is reproducible for me every time but I am not sure why the jenkins job don't show this test failure. When I looked into this I found that the test depends on the method
      get the number of cache directives on the hdfs.

        def get_num_cache_requests_util():
          rc, stdout, stderr = exec_process("hdfs cacheadmin -listDirectives -stats")
          assert rc == 0, 'Error executing hdfs cacheadmin: %s %s' % (stdout, stderr)
          return len(stdout.split('\n'))
      

      This output of this command when there are no entries is

      Found 0 entries
      

      when there are entries the output looks like

      Found 4 entries
        ID POOL       REPL EXPIRY  PATH                                                    BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
       225 testPool      8 never   /test-warehouse/cachedb.db/cached_tbl_reload                       0             0             0             0
       226 testPool      8 never   /test-warehouse/cachedb.db/cached_tbl_reload_part                  0             0             0             0
       227 testPool      8 never   /test-warehouse/cachedb.db/cached_tbl_reload_part/j=1              0             0             0             0
       228 testPool      8 never   /test-warehouse/cachedb.db/cached_tbl_reload_part/j=2              0             0             0             0
      

      When there are no entries there is also a additional new line which is counted.
      So when there are no entries the method outputs 2 and when there are 4 entries the method outputs 7 which causes the failure because the test expects 2+4.

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            vihangk1 Vihang Karajgaonkar
            vihangk1 Vihang Karajgaonkar
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment