Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.0
    • Fix Version/s: Impala 2.1
    • Component/s: None
    • Labels: None

      Description

      Fetches of 10M rows of single-column data from a Kerberos-enabled cluster are 3x slower than when Kerberos is turned off.

      The problem appears to be in how long fetch() takes to run. Impala reports high ClientFetchWaitTime values in both cases. The profiles show that the processing time for the query itself is roughly the same in both cases (for example, the time taken to scan the table does not change).

      I instrumented impala-shell to show the fetch() latencies; they're ~3x larger when Kerberos is enabled. I also added a micro-benchmark of 10k PingImpalaService() calls at startup time. Note that the pings finish nearly 2x faster when Kerberos is disabled.
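
A minimal sketch of the kind of timing loop described above, for readers who want to reproduce the measurement. This is not the actual impala-shell patch: `time_calls` and `fake_ping` are hypothetical names, and `fake_ping` is a no-op stand-in for a real PingImpalaService() RPC on an open connection.

```python
import time


def time_calls(call, n):
    """Invoke call() n times and return (total, max, avg) latency in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return total, max(latencies), total / n


def fake_ping():
    # Hypothetical stand-in: a real run would issue PingImpalaService() here.
    pass


total, worst, avg = time_calls(fake_ping, 10000)
print("10000 pings in: %ss" % total)
print("Max latency: %ss" % worst)
print("Avg. latency: %ss" % avg)
```

The same helper could wrap each fetch() call to collect the total and maximum fetch times that appear in the transcripts.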

      With Kerberos

      [root@c2102 henry]# time  impala-shell  -i c2108.hal.cloudera.com -B -k -q "select * from foo"
      Starting Impala Shell using Kerberos authentication
      Using service name 'impala'
      10000 pings in: 3.86384677887s
      Max latency: 0.000801086425781s
      Avg. latency: 0.000386384677887s
      Connected to c2108.hal.cloudera.com:21000
      Server version: impalad version 2.1.0-cdh5-INTERNAL RELEASE (build 441ae18facf3b1f80a30122cc538e95d408d81fa)
      Query: select * from foo
      9769 fetches completed
      **** Total fetch time: 312.591347694s
      **** Max fetch time: 0.0757119655609s
      **** Max fetch wait: 0.0763940811157s
      Fetched 10000001 row(s) in 320.03s
      
      real    5m24.138s
      user    4m36.487s
      sys     0m26.054s
      [root@c2102 henry]#
      

      Without Kerberos

      [root@c2102 henry]# time  impala-shell  -i c2108.hal.cloudera.com -B -q "select * from foo"
      Starting Impala Shell without Kerberos authentication
      10000 pings in: 2.09184002876s
      Max latency: 0.000752925872803s
      Avg. latency: 0.000209184002876s
      Connected to c2108.hal.cloudera.com:21000
      Server version: impalad version 2.1.0-cdh5-INTERNAL RELEASE (build 441ae18facf3b1f80a30122cc538e95d408d81fa)
      Query: select * from foo
      9770 fetches completed
      **** Total fetch time: 90.6379804611s
      **** Max fetch time: 0.0160579681396s
      **** Max fetch wait: 0.022047996521s
      Fetched 10000001 row(s) in 99.67s
      
      real    1m41.975s
      user    1m19.318s
      sys     0m0.998s
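
The slowdown factors can be checked directly from the figures in the two transcripts. The sketch below only does that arithmetic (the dictionary keys and helper names are made up for illustration): it gives about 3.2x for end-to-end time, about 3.4x for total fetch time, and about 1.85x for the 10k pings, with the average fetch taking roughly 32 ms with Kerberos against roughly 9.3 ms without.

```python
# Figures copied from the two impala-shell runs above.
kerberos = {
    "pings_s": 3.86384677887,   # 10000 pings
    "fetches": 9769,            # fetches completed
    "fetch_s": 312.591347694,   # total fetch time
    "wall_s": 320.03,           # "Fetched ... row(s) in"
}
plain = {
    "pings_s": 2.09184002876,
    "fetches": 9770,
    "fetch_s": 90.6379804611,
    "wall_s": 99.67,
}


def ratio(key):
    """Slowdown factor of the Kerberos run relative to the non-Kerberos run."""
    return kerberos[key] / plain[key]


def avg_fetch_ms(run):
    """Average time per fetch() call, in milliseconds."""
    return 1000.0 * run["fetch_s"] / run["fetches"]


print("End-to-end slowdown:  %.2fx" % ratio("wall_s"))
print("Fetch-time slowdown:  %.2fx" % ratio("fetch_s"))
print("Ping slowdown:        %.2fx" % ratio("pings_s"))
print("Avg fetch (Kerberos): %.2f ms" % avg_fetch_ms(kerberos))
print("Avg fetch (plain):    %.2f ms" % avg_fetch_ms(plain))
```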
      

            People

            • Assignee: Matthew Jacobs (mjacobs)
            • Reporter: Henry Robinson (henryr)
