Details
- Bug
- Status: Open
- Critical
- Resolution: Unresolved
- Impala 3.4.0
- None
- None
Description
I was messing around with running Impala in a single-node dockerized configuration and ran into a bunch of weirdness after I restarted the impalad. It got into a state where there was a new and an old statestore registration with the same IP/port and different hostnames (since Docker generates a new hostname for each incarnation of the container).
I saw a crash in Coordinator::GetRootSink(). The cause of that is the coordinator treating the same impalad as two distinct backends, and sending two execute RPCs to the backend (this is a single node cluster).
I0528 17:32:41.760128 573 coordinator.cc:143] f84b158b036445ad:3a9defdf00000000] Exec() query_id=f84b158b036445ad:3a9defdf00000000 stmt=SELECT COUNT(*) FROM tpcds_kudu.call_center
I0528 17:32:41.760670 573 coordinator.cc:463] f84b158b036445ad:3a9defdf00000000] starting execution on 2 backends for query_id=f84b158b036445ad:3a9defdf00000000
..
I0528 17:32:41.762449 78 control-service.cc:153] f84b158b036445ad:3a9defdf00000000] ExecQueryFInstances(): query_id=f84b158b036445ad:3a9defdf00000000 coord=a16ac03fc53b:22000 #instances=1
I0528 17:32:41.761706 79 control-service.cc:153] f84b158b036445ad:3a9defdf00000000] ExecQueryFInstances(): query_id=f84b158b036445ad:3a9defdf00000000 coord=a16ac03fc53b:22000 #instances=4
..
Wrote minidump to /opt/impala/logs/minidumps/impalad/15727084-c931-49e1-62d37e86-75cfe0f6.dmp
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00000000011a0d50, pid=1, tid=0x00007f92b5e8c700
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
Wrote minidump to /opt/impala/logs/minidumps/impalad/15727084-c931-49e1-62d37e86-75cfe0f6.dmp
# C [impalad+0xda0d50] impala::FragmentInstanceState::GetRootSink() const+0x0
#
# Core dump written. Default location: /opt/impala/core or core.1
#
# An error report file with more information is saved as:
# /opt/impala/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
CC twm378
At a separate time I saw it trip the "Tried to add existing backend to executor group" case in ExecutorGroup::AddExecutor().
void ExecutorGroup::AddExecutor(const BackendDescriptorPB& be_desc) {
  // be_desc.is_executor can be false for the local backend when scheduling queries to run
  // on the coordinator host.
  DCHECK(!be_desc.ip_address().empty());
  Executors& be_descs = executor_map_[be_desc.ip_address()];
  auto eq = [&be_desc](const BackendDescriptorPB& existing) {
    // The IP addresses must already match, so it is sufficient to check the port.
    DCHECK_EQ(existing.ip_address(), be_desc.ip_address());
    return existing.address().port() == be_desc.address().port();
  };
  if (find_if(be_descs.begin(), be_descs.end(), eq) != be_descs.end()) {
    LOG(DFATAL) << "Tried to add existing backend to executor group: "
                << be_desc.krpc_address();
    return;
  }
  if (!CheckConsistencyOrWarn(be_desc)) {
    LOG(WARNING) << "Ignoring inconsistent backend for executor group: "
                 << be_desc.krpc_address();
    return;
  }
  if (be_descs.empty()) {
    executor_ip_hash_ring_.AddNode(be_desc.ip_address());
  }
  be_descs.push_back(be_desc);
  executor_ip_map_[be_desc.address().hostname()] = be_desc.ip_address();
}
I'm not sure if using the hostname to identify impalads is even useful at this point. We could probably simplify this by using the IP address only.
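To illustrate the IP-only idea, here is a minimal sketch of a registry that identifies a backend by its IP address and port and ignores the hostname for identity. It is not Impala code: BackendDesc, BackendRegistry, AddOrReplace and Find are hypothetical names made up for this example, and the IP address used in the usage note is invented. It assumes C++17 for std::map::insert_or_assign.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for BackendDescriptorPB; not Impala's actual type.
struct BackendDesc {
  std::string hostname;
  std::string ip_address;
  int32_t port;
};

// Hypothetical registry that keys backends by (IP, port) rather than hostname.
class BackendRegistry {
 public:
  // Adds the backend, or replaces an existing entry with the same IP/port.
  // Returns true if a new entry was created, false if an entry was replaced.
  bool AddOrReplace(const BackendDesc& be) {
    assert(!be.ip_address.empty());
    auto key = std::make_pair(be.ip_address, be.port);
    return backends_.insert_or_assign(key, be).second;
  }

  // Returns the backend registered at ip:port, or nullptr if there is none.
  const BackendDesc* Find(const std::string& ip, int32_t port) const {
    auto it = backends_.find(std::make_pair(ip, port));
    return it == backends_.end() ? nullptr : &it->second;
  }

  std::size_t size() const { return backends_.size(); }

 private:
  std::map<std::pair<std::string, int32_t>, BackendDesc> backends_;
};
```

With a registry like this, a restarted container that re-registers as, say, 172.17.0.2:22000 under a new hostname would overwrite the stale entry instead of appearing as a second backend, so the coordinator would see one backend rather than two. Whether replacing silently is the right policy (as opposed to rejecting the newer or the older registration) is a separate design question that this sketch does not settle.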
Attachments
Issue Links
- is related to
- IMPALA-9790 Dockerized daemons should set --hostname to the resolved IP
- Resolved