IMPALA-9798

TestScratchDir.test_multiple_dirs fails to start impalad


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • None
    • Impala 4.0.0
    • None

    Description

      Seen in an exhaustive test job:
      Stacktrace:

      custom_cluster/test_scratch_disk.py:97: in test_multiple_dirs
          '--impalad_args=--disk_spill_punch_holes=true'])
      common/custom_cluster_test_suite.py:277: in _start_impala_cluster
          check_call(cmd + options, close_fds=True)
      /data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/python-2.7.16/lib/python2.7/subprocess.py:190: in check_call
          raise CalledProcessError(retcode, cmd)
      E   CalledProcessError: Command '['/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py', '--state_store_args=--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', '--num_coordinators=3', '--log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests', '--log_level=1', '--impalad_args=-logbuflevel=-1 -scratch_dirs=/tmp/tmpR006lp,/tmp/tmpzKVBYt,/tmp/tmpBLcN_O,/tmp/tmp6kqoj5,/tmp/tmpT_R39r', '--impalad_args=--allow_multiple_scratch_dirs_per_device=false', '--impalad_args=--disk_spill_compression_codec=zstd', '--impalad_args=--disk_spill_punch_holes=true', '--impalad_args=--default_query_options=']' returned non-zero exit status 1
      

      Standard Output:

      Generated dir/tmp/tmpR006lp
      Generated dir/tmp/tmpzKVBYt
      Generated dir/tmp/tmpBLcN_O
      Generated dir/tmp/tmp6kqoj5
      Generated dir/tmp/tmpT_R39r
      

      Standard Error:

      15:14:51 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
      15:14:51 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      15:14:51 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      15:14:54 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com:25000
      15:14:54 MainThread: 'backends'
      15:14:54 MainThread: Waiting for num_known_live_backends=3. Current value: None
      15:14:55 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
      15:14:55 MainThread: Error starting cluster
      Traceback (most recent call last):
        File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py", line 770, in <module>
          expected_cluster_size - expected_catalog_delays)
        File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 186, in wait_until_ready
          early_abort_fn=check_processes_still_running)
        File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_service.py", line 284, in wait_for_num_known_live_backends
          early_abort_fn()
        File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 178, in check_processes_still_running
          assert len(self.impalads) >= expected_num_impalads
      AssertionError
      DEBUG:impala_cluster:Found 2 impalad/1 statestored/1 catalogd process(es)
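
The traceback above shows the startup wait aborting early because one impalad process disappeared (2 found, 3 expected). The pattern can be sketched roughly as follows. This is only an illustration, not the actual code in tests/common/impala_service.py; `wait_for_value`, `make_process_check`, and the timing parameters are hypothetical names chosen for the example.

```python
import time


def wait_for_value(get_value, expected, timeout_s=5.0, interval_s=0.01,
                   early_abort_fn=lambda: None):
    """Poll get_value() until it returns expected or timeout_s elapses.

    early_abort_fn is called on every iteration and may raise to stop
    waiting as soon as success has become impossible.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        early_abort_fn()  # e.g. assert that all daemons are still alive
        if get_value() == expected:
            return True
        time.sleep(interval_s)
    return False


def make_process_check(count_running, expected_num):
    """Return a callable that raises AssertionError if a process died."""
    def check():
        found = count_running()
        assert found >= expected_num, (
            "expected %d processes, found %d" % (expected_num, found))
    return check
```

With a check like this, a crashed daemon surfaces as an immediate AssertionError instead of a full timeout, which matches the behaviour seen in the log.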
      

      Looking into the crashed impalad's log:

      I0528 15:14:54.587469 15245 tmp-file-mgr.cc:229] Using scratch directory /tmp/tmpR006lp/impala-scratch on disk 0 limit: 8589934592.00 GB
      I0528 15:14:54.648952 15245 status.cc:129] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
          @          0x1d5b072  impala::Status::Status()
          @          0x264fa09  impala::FileSystemUtil::CheckHolePunch()
          @          0x22e6947  impala::TmpFileMgr::InitCustom()
          @          0x22e59a4  impala::TmpFileMgr::InitCustom()
          @          0x22e58f0  impala::TmpFileMgr::Init()
          @          0x248835c  impala::ImpalaServer::ImpalaServer()
          @          0x2483a34  ImpaladMain()
          @          0x1d048af  main
          @     0x7f23f3a64c04  __libc_start_main
          @          0x1d04726  (unknown)
      E0528 15:14:54.649108 15245 impala-server.cc:394] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
      E0528 15:14:54.649127 15245 impala-server.cc:397] Aborting Impala Server startup due to improperly configured scratch directories.. Impalad exiting.
      

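      FileSystemUtil::CheckHolePunch() could not find its test file under /tmp/tmpR006lp/impala-scratch, so it looks like the scratch dir was not created successfully.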
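
One way to probe that hypothesis on a rerun would be to inspect each scratch directory right after the failure. The sketch below assumes the `impala-scratch` subdirectory name seen in the log; `check_scratch_dirs` is a hypothetical helper and not part of the Impala test suite.

```python
import os
import tempfile


def check_scratch_dirs(base_dirs, subdir="impala-scratch"):
    """Return a list of (path, problem) tuples for unusable scratch dirs."""
    problems = []
    for base in base_dirs:
        path = os.path.join(base, subdir)
        if not os.path.isdir(path):
            problems.append((path, "missing"))
            continue
        try:
            # Creating and deleting a temp file shows the dir is writable.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
        except OSError as e:
            problems.append((path, "not writable: %s" % e))
    return problems
```

An empty result would suggest the directories themselves are fine and that the missing file has some other cause.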

            People

              Assignee: Tim Armstrong (tarmstrong)
              Reporter: Quanlong Huang (stigahuang)
              Votes: 0
              Watchers: 3
