Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Description
Similar to IMPALA-10301, we see a timeout in test_local_catalog_ddls_with_invalidate_metadata_sync_ddl, but this time on an ALTER TABLE command:
custom_cluster/test_concurrent_ddls.py:77: in test_local_catalog_ddls_with_invalidate_metadata_sync_ddl
self._run_ddls_with_invalidation(unique_database, sync_ddl=True)
custom_cluster/test_concurrent_ddls.py:146: in _run_ddls_with_invalidation
worker[i].get(timeout=100)
/data/jenkins/workspace/impala-cdw-master-staging-core-s3/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:572: in get
raise self._value
E AssertionError: Query timeout(60s): alter table test_local_catalog_ddls_with_invalidate_metadata_sync_ddl_b87f02d6.test_1_part add partition (j=2)
E assert False
In the catalogd logs, I can see a warning for this query:
W0914 07:51:53.997368 1362 JniUtil.java:160] Response too slow: size=323 (323B), duration=103734ms (1m43s), method: execDdl, request: TDdlExecRequest(protocol_version:V1, header:TCatalogServiceRequestHeader(requesting_user:jenkins, redacted_sql_stmt:alter table test_local_catalog_ddls_with_invalidate_metadata_sync_ddl_b87f02d6.test_1_part add partition (j=2), client_ip:127.0.0.1, want_minimal_response:true), ddl_type:ALTER_TABLE, alter_table_params:TAlterTableParams(alter_type:ADD_PARTITION, table_name:TTableName(db_name:test_local_catalog_ddls_with_invalidate_metadata_sync_ddl_b87f02d6, table_name:test_1_part), add_partition_params:TAlterTableAddPartitionParams(if_not_exists:false, partitions:[TPartitionDef(partition_spec:[TPartitionKeyValue(name:j, value:2)])])), query_options:TDdlQueryOptions(sync_ddl:true, debug_action:, lock_max_wait_time_s:300))
The execDdl request took 1m43s in total, while the query timeout is 60s. We need to investigate further what is going on in catalogd.
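
To check whether other DDLs in the same run were similarly slow, the catalogd log can be scanned for "Response too slow" warnings whose duration exceeds the 60s test timeout. The following is only a minimal sketch: the helper name find_slow_ddls and both regular expressions are hypothetical and are derived solely from the single log line quoted above, so other log formats may not match.

```python
import re

# Assumption: the warning format is inferred from the one JniUtil.java log
# line quoted in this report; it is not taken from the Impala source.
SLOW_RESPONSE_RE = re.compile(
    r"Response too slow: size=\d+ \([^)]*\), "
    r"duration=(?P<duration_ms>\d+)ms \([^)]*\), "
    r"method: (?P<method>\w+)"
)
# Assumption: the statement is followed by ", client_ip:" as in the quoted line.
STMT_RE = re.compile(r"redacted_sql_stmt:(?P<stmt>.*?), client_ip:")


def find_slow_ddls(log_lines, timeout_s=60):
    """Return (duration_s, method, statement) for responses slower than timeout_s."""
    slow = []
    for line in log_lines:
        match = SLOW_RESPONSE_RE.search(line)
        if match is None:
            continue  # not a "Response too slow" warning
        duration_s = int(match.group("duration_ms")) / 1000.0
        if duration_s <= timeout_s:
            continue  # slow enough to be logged, but within the timeout
        stmt_match = STMT_RE.search(line)
        stmt = stmt_match.group("stmt") if stmt_match else None
        slow.append((duration_s, match.group("method"), stmt))
    return slow
```

Passing the lines of a catalogd log file to find_slow_ddls would, under these assumptions, list each request that ran longer than the timeout together with its method and redacted statement. For the warning quoted above it would report a duration of 103.734 seconds for execDdl.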