Details
Bug
Status: Resolved
Critical
Resolution: Fixed
Impala 2.0
None
None
Description
Fetches of 10M rows of single-column data from a Kerberos-enabled cluster are 3x slower than when Kerberos is turned off.
The problem appears to be in how long fetch() takes to run. Impala reports high ClientFetchWaitTime values in both cases. The profiles show that the processing time for the query itself is roughly constant (for example, the time taken to scan the table does not change).
I instrumented impala-shell to show the fetch() latencies; they're ~3x larger when Kerberos is enabled. I also added a micro-benchmark of 10k PingImpalaService() calls at startup; these finish nearly 2x faster when Kerberos is disabled.
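For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of timing loop described above. It is not the actual patch applied to impala-shell: `benchmark_calls` and `format_report` are hypothetical helper names, and the callable passed in is assumed to wrap a single RPC such as PingImpalaService() on an already-connected client.

```python
import time


def benchmark_calls(call, n=10000, clock=time.perf_counter):
    """Invoke call() n times and return (total, max, avg) latency in seconds.

    `call` is any zero-argument callable; in the scenario above it would wrap
    one PingImpalaService() RPC (an assumption, the client is not shown here).
    """
    total = 0.0
    worst = 0.0
    for _ in range(n):
        start = clock()
        call()
        elapsed = clock() - start
        total += elapsed
        worst = max(worst, elapsed)
    avg = (total / n) if n else 0.0
    return total, worst, avg


def format_report(n, total, worst, avg):
    """Format the result in the same shape as the shell output in this report."""
    return (
        "%d pings in: %ss\n"
        "Max latency: %ss\n"
        "Avg. latency: %ss" % (n, total, worst, avg)
    )


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with the real RPC call.
    total, worst, avg = benchmark_calls(lambda: None, n=10000)
    print(format_report(10000, total, worst, avg))
```

The same loop can wrap fetch() to collect the per-fetch totals and maxima; the clock is injectable so the helper can be checked without a live cluster.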
With Kerberos
[root@c2102 henry]# time impala-shell -i c2108.hal.cloudera.com -B -k -q "select * from foo"
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
10000 pings in: 3.86384677887s
Max latency: 0.000801086425781s
Avg. latency: 0.000386384677887s
Connected to c2108.hal.cloudera.com:21000
Server version: impalad version 2.1.0-cdh5-INTERNAL RELEASE (build 441ae18facf3b1f80a30122cc538e95d408d81fa)
Query: select * from foo
9769 fetches completed
**** Total fetch time: 312.591347694s
**** Max fetch time: 0.0757119655609s
**** Max fetch wait: 0.0763940811157s
Fetched 10000001 row(s) in 320.03s
real 5m24.138s
user 4m36.487s
sys 0m26.054s
[root@c2102 henry]#
Without Kerberos
[root@c2102 henry]# time impala-shell -i c2108.hal.cloudera.com -B -q "select * from foo"
Starting Impala Shell without Kerberos authentication
10000 pings in: 2.09184002876s
Max latency: 0.000752925872803s
Avg. latency: 0.000209184002876s
Connected to c2108.hal.cloudera.com:21000
Server version: impalad version 2.1.0-cdh5-INTERNAL RELEASE (build 441ae18facf3b1f80a30122cc538e95d408d81fa)
Query: select * from foo
9770 fetches completed
**** Total fetch time: 90.6379804611s
**** Max fetch time: 0.0160579681396s
**** Max fetch wait: 0.022047996521s
Fetched 10000001 row(s) in 99.67s
real 1m41.975s
user 1m19.318s
sys 0m0.998s
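To put the two runs side by side, the short script below derives slowdown ratios and per-call averages from the figures printed above. The numbers are copied from the output in this report; the dictionary layout and the helper name `per_call_ms` are only illustrative.

```python
# Figures copied from the two impala-shell runs in this report.
with_kerberos = {
    "ping_total_s": 3.86384677887,
    "fetches": 9769,
    "fetch_total_s": 312.591347694,
    "query_s": 320.03,
}
without_kerberos = {
    "ping_total_s": 2.09184002876,
    "fetches": 9770,
    "fetch_total_s": 90.6379804611,
    "query_s": 99.67,
}


def per_call_ms(total_s, calls):
    """Average time per call in milliseconds."""
    return 1000.0 * total_s / calls


fetch_ratio = with_kerberos["fetch_total_s"] / without_kerberos["fetch_total_s"]
ping_ratio = with_kerberos["ping_total_s"] / without_kerberos["ping_total_s"]
query_ratio = with_kerberos["query_s"] / without_kerberos["query_s"]

fetch_ms_with = per_call_ms(with_kerberos["fetch_total_s"], with_kerberos["fetches"])
fetch_ms_without = per_call_ms(
    without_kerberos["fetch_total_s"], without_kerberos["fetches"]
)

print("Total fetch time ratio: %.2fx" % fetch_ratio)
print("Ping benchmark ratio:   %.2fx" % ping_ratio)
print("End-to-end query ratio: %.2fx" % query_ratio)
print("Avg fetch with Kerberos:    %.2f ms" % fetch_ms_with)
print("Avg fetch without Kerberos: %.2f ms" % fetch_ms_without)
```

This gives about 3.45x for total fetch time, about 1.85x for the ping benchmark, and about 3.21x end to end, with the average fetch going from roughly 9.3 ms to roughly 32.0 ms. The per-fetch gap is much larger than the per-ping gap.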