Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
Description
Saw in an exhaustive job:
Stacktrace:
custom_cluster/test_scratch_disk.py:97: in test_multiple_dirs
    '--impalad_args=--disk_spill_punch_holes=true'])
common/custom_cluster_test_suite.py:277: in _start_impala_cluster
    check_call(cmd + options, close_fds=True)
/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/python-2.7.16/lib/python2.7/subprocess.py:190: in check_call
    raise CalledProcessError(retcode, cmd)
E   CalledProcessError: Command '['/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py', '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', '--num_coordinators=3', '--log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests', '--log_level=1', '--impalad_args=-logbuflevel=-1 -scratch_dirs=/tmp/tmpR006lp,/tmp/tmpzKVBYt,/tmp/tmpBLcN_O,/tmp/tmp6kqoj5,/tmp/tmpT_R39r', '--impalad_args=--allow_multiple_scratch_dirs_per_device=false', '--impalad_args=--disk_spill_compression_codec=zstd', '--impalad_args=--disk_spill_punch_holes=true', '--impalad_args=--default_query_options=']' returned non-zero exit status 1
Standard Output:
Generated dir/tmp/tmpR006lp
Generated dir/tmp/tmpzKVBYt
Generated dir/tmp/tmpBLcN_O
Generated dir/tmp/tmp6kqoj5
Generated dir/tmp/tmpT_R39r
Standard Error:
15:14:51 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
15:14:51 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
15:14:51 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
15:14:54 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com:25000
15:14:54 MainThread: 'backends'
15:14:54 MainThread: Waiting for num_known_live_backends=3. Current value: None
15:14:55 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
15:14:55 MainThread: Error starting cluster
Traceback (most recent call last):
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py", line 770, in <module>
    expected_cluster_size - expected_catalog_delays)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 186, in wait_until_ready
    early_abort_fn=check_processes_still_running)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_service.py", line 284, in wait_for_num_known_live_backends
    early_abort_fn()
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 178, in check_processes_still_running
    assert len(self.impalads) >= expected_num_impalads
AssertionError
DEBUG:impala_cluster:Found 2 impalad/1 statestored/1 catalogd process(es)
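For readers unfamiliar with the pattern in the traceback, the bare AssertionError comes from a polling loop that is given an early-abort callback: while waiting for num_known_live_backends to reach 3, the callback notices that only 2 impalad processes are left and fails immediately instead of waiting for the timeout. The following is a minimal sketch of that pattern, not the actual code in tests/common; the names wait_for_value and make_process_check, and the timeout and interval values, are hypothetical.

```python
import time


def wait_for_value(get_value, expected, timeout_s=5.0, interval_s=0.01,
                   early_abort_fn=lambda: None):
    """Poll get_value() until it returns `expected`; raise on timeout."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        value = get_value()
        if value == expected:
            return value
        # Fail fast if the condition we are waiting for can no longer be met,
        # e.g. because one of the daemons has already exited.
        early_abort_fn()
        time.sleep(interval_s)
    raise RuntimeError("timed out waiting for %r" % (expected,))


def make_process_check(count_processes, expected_num):
    """Build an early-abort callback that asserts enough processes are alive."""
    def check_processes_still_running():
        assert count_processes() >= expected_num, "a daemon exited during startup"
    return check_processes_still_running
```

With a callback built as make_process_check(lambda: 2, 3), the first unsuccessful poll raises AssertionError, which mirrors the sequence in the log above: "Current value: None", then "Found 2 impalad", then "Error starting cluster".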
Looking into the crashed impalad's log:
I0528 15:14:54.587469 15245 tmp-file-mgr.cc:229] Using scratch directory /tmp/tmpR006lp/impala-scratch on disk 0 limit: 8589934592.00 GB
I0528 15:14:54.648952 15245 status.cc:129] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
    @ 0x1d5b072 impala::Status::Status()
    @ 0x264fa09 impala::FileSystemUtil::CheckHolePunch()
    @ 0x22e6947 impala::TmpFileMgr::InitCustom()
    @ 0x22e59a4 impala::TmpFileMgr::InitCustom()
    @ 0x22e58f0 impala::TmpFileMgr::Init()
    @ 0x248835c impala::ImpalaServer::ImpalaServer()
    @ 0x2483a34 ImpaladMain()
    @ 0x1d048af main
    @ 0x7f23f3a64c04 __libc_start_main
    @ 0x1d04726 (unknown)
E0528 15:14:54.649108 15245 impala-server.cc:394] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
E0528 15:14:54.649127 15245 impala-server.cc:397] Aborting Impala Server startup due to improperly configured scratch directories.. Impalad exiting.
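It looks like the scratch dir was not created successfully: the temporary file under /tmp/tmpR006lp/impala-scratch was missing when CheckHolePunch() tried to read its post-punch size.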
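One possible explanation, not confirmed in this report, is a startup race of the kind named in the linked IMPALA-2162: the test starts three impalads on one host with the same -scratch_dirs, so one daemon could clear the shared impala-scratch directory while another is in the middle of its hole-punch check. The sketch below reproduces that sequence in a single process. It assumes each daemon removes and recreates its scratch subdirectory at startup; remove_and_create and simulate_clobber are hypothetical names, and "probe-file" stands in for the randomly named file in the log.

```python
import os
import shutil
import tempfile


def remove_and_create(path):
    """Assumed startup behaviour: wipe the scratch subdirectory, then recreate it."""
    shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path)


def simulate_clobber():
    """Return the errno seen when stat'ing a probe file that a second 'daemon' removed."""
    base = tempfile.mkdtemp()
    scratch = os.path.join(base, "impala-scratch")
    try:
        # Daemon A initialises the scratch dir and writes its probe file.
        remove_and_create(scratch)
        probe = os.path.join(scratch, "probe-file")
        with open(probe, "wb") as f:
            f.write(b"\0" * 4096)

        # Daemon B starts with the same scratch dir and wipes it.
        remove_and_create(scratch)

        # Daemon A now tries to read the file size after punching the hole.
        try:
            os.stat(probe)
        except OSError as e:
            return e.errno
        return 0
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

simulate_clobber() returns errno.ENOENT, the same "No such file or directory (error 2)" condition reported by the crashed impalad.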
Attachments
Issue Links
- is broken by: IMPALA-3766 Optionally compress spilled data before writing it to disk (Resolved)
- is duplicated by: IMPALA-9710 Flakiness in test_write_error_failover (Resolved)
- relates to: IMPALA-2162 Scratch files can be clobbered with multiple impalads per node (Resolved)