Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Duplicate
-
Impala 1.2
-
None
-
None
-
Kerberized 17-node cluster.
Description
This happened during the nightly performance run, which kept failing, and it is consistently reproducible. It looks like the catalog service keeps failing to connect to the Hive metastore. I checked the metastore, and it was up at the time and had been up throughout. We don't see this problem on an unsecured cluster, so it is likely related to Kerberos.
catalogd.INFO
Adding Update: CATALOG:469a9123f6c34bc5:98dea749aed26c97@475
I1009 22:32:30.015769 13542 UserGroupInformation.java:939] Initiating logout for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:32:30.016191 13542 UserGroupInformation.java:949] Initiating re-login for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:33:02.966976 16029 catalog-server.cc:164] Catalog Version: 475 Last Catalog Version: 475
I1009 22:37:18.823482 16029 catalog-server.cc:164] Catalog Version: 475 Last Catalog Version: 475
I1009 22:41:34.178099 16029 catalog-server.cc:164] Catalog Version: 475 Last Catalog Version: 475
I1009 22:44:30.010795 13542 UserGroupInformation.java:939] Initiating logout for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:44:30.011219 13542 UserGroupInformation.java:949] Initiating re-login for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:45:49.561247 16029 catalog-server.cc:164] Catalog Version: 475 Last Catalog Version: 475
I1009 22:49:18.510164 16029 thrift-util.cc:97] TSocket::peek() recv() <Host: 172.20.86.10 Port: 33235>Connection reset by peer
I1009 22:49:18.510372 16029 thrift-util.cc:97] TThreadedServer client died: recv(): Connection reset by peer
The error log is attached.
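For anyone triaging the attached log, here is a minimal Python sketch that parses glog-style lines like the ones in the catalogd.INFO excerpt and measures the time between Kerberos re-logins and the later connection reset. The helper name `parse_glog`, the regular expression, and the placeholder year are assumptions made for illustration and are not part of Impala. glog lines carry no year, so the default is arbitrary and only the differences between timestamps are meaningful. The sketch reports timing only and does not establish that the re-login causes the reset.

```python
import re
from datetime import datetime

# Assumed layout of a glog line, inferred from the excerpt:
#   <level><MMDD> <HH:MM:SS.uuuuuu> <thread id> <file>:<line>] <message>
GLOG_RE = re.compile(
    r"(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<tid>\d+) (?P<src>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)"
)


def parse_glog(line, year=2000):
    """Parse one glog line into a dict, or return None if it does not match.

    `year` is a placeholder because glog does not record the year.
    """
    m = GLOG_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(
        f"{year}-{m['month']}-{m['day']} {m['time']}", "%Y-%m-%d %H:%M:%S.%f"
    )
    return {"ts": ts, "tid": int(m["tid"]), "src": m["src"], "msg": m["msg"]}


# Three lines copied from the excerpt in this report.
SAMPLE = """\
I1009 22:32:30.016191 13542 UserGroupInformation.java:949] Initiating re-login for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:44:30.011219 13542 UserGroupInformation.java:949] Initiating re-login for impala/c2104.hal.cloudera.com@HAL17.CLOUDERA.COM
I1009 22:49:18.510164 16029 thrift-util.cc:97] TSocket::peek() recv() <Host: 172.20.86.10 Port: 33235>Connection reset by peer
"""

# Keep only the lines that match the assumed layout.
entries = [e for e in map(parse_glog, SAMPLE.splitlines()) if e is not None]

relogins = [e for e in entries if "Initiating re-login" in e["msg"]]
resets = [e for e in entries if "Connection reset by peer" in e["msg"]]

# Seconds between the two re-login attempts.
gap_seconds = (relogins[1]["ts"] - relogins[0]["ts"]).total_seconds()
# Seconds from the last re-login to the first connection reset.
since_relogin_seconds = (resets[0]["ts"] - relogins[-1]["ts"]).total_seconds()

print(f"re-login interval: {gap_seconds:.3f} s")
print(f"last re-login to reset: {since_relogin_seconds:.3f} s")
```

Running the same parser over the full attached log would show whether the resets follow the re-logins at a consistent offset or are unrelated in time.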