IMPALA-3080

Slow performance on Kerberos cluster - Possible inefficient implementation of TSaslTransport::write


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: Impala 2.6.0
    • Component/s: Perf Investigation
    • Labels:
      None

      Description

      Test results for Kerberized and non-Kerberized clusters (CDH5.4.9):
      Driver version: 2.5.31
      Dataset: 11 STRING columns and 50000 rows (same as the one used for CDH 5.0 testing)
      Linux driver:

      • Kerberized Impala: 4.232 s
      • Non-Kerberized Impala (NOSASL): 1.88 s
      • Kerberized Hive: 1.6 s
      • Non-Kerberized Hive (SASL-PLAIN): 1.413 s

      Impala sent 7694354 bytes, versus 5770412 bytes sent by Hive. A closer look at the Wireshark/tcpdump trace shows that most TCP packets from Impala carry only 1, 2, or 4 bytes of data, whereas most TCP packets from Hive carry 1460 bytes of data. Impala sends the entire dataset in 33126 TCP packets, whereas Hive needs only 4882.
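
      The trace pattern above (many packets carrying 1, 2, or 4 bytes) is what one would expect if every small write from the serializer were framed and sent on its own instead of being buffered until flush. The following is a minimal Python sketch of that effect, not the actual Thrift or Impala code. The classes `CountingSink`, `PerWriteFraming`, and `BufferedFraming` are hypothetical, the 4-byte length prefix per frame is an assumption made for illustration, and no encryption or real socket is modelled.

```python
import struct


class CountingSink:
    """Stands in for a socket: records the payload of every send() call."""

    def __init__(self):
        self.sends = []

    def send(self, data: bytes) -> None:
        self.sends.append(bytes(data))

    @property
    def total_bytes(self) -> int:
        return sum(len(s) for s in self.sends)


def frame(payload: bytes) -> bytes:
    # Assumed framing: 4-byte big-endian length prefix followed by the payload.
    return struct.pack(">I", len(payload)) + payload


class PerWriteFraming:
    """Hypothetical transport that frames and sends every write() immediately."""

    def __init__(self, sink):
        self.sink = sink

    def write(self, data: bytes) -> None:
        self.sink.send(frame(data))

    def flush(self) -> None:
        pass  # nothing is buffered, so there is nothing to flush


class BufferedFraming:
    """Hypothetical transport that accumulates writes and frames once per flush()."""

    def __init__(self, sink):
        self.sink = sink
        self.buf = bytearray()

    def write(self, data: bytes) -> None:
        self.buf += data

    def flush(self) -> None:
        if self.buf:
            self.sink.send(frame(bytes(self.buf)))
            self.buf.clear()


def serialize_row(transport, row) -> None:
    # Each string column becomes two small writes: a 4-byte length, then the bytes.
    for value in row:
        encoded = value.encode("utf-8")
        transport.write(struct.pack(">I", len(encoded)))
        transport.write(encoded)


def run(transport_cls, rows):
    """Return (number of sends, total bytes sent) for the given transport class."""
    sink = CountingSink()
    transport = transport_cls(sink)
    for row in rows:
        serialize_row(transport, row)
    transport.flush()
    return len(sink.sends), sink.total_bytes


if __name__ == "__main__":
    rows = [["a"] * 11 for _ in range(100)]  # 11 string columns, as in the test dataset
    print("per-write:", run(PerWriteFraming, rows))
    print("buffered: ", run(BufferedFraming, rows))
```

      In this toy model the per-write variant issues one send per serializer write and pays the frame header each time, which raises both the number of sends and the total byte count. That is the same direction as the difference seen between Impala and Hive in the trace, although the sketch does not establish the actual cause.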

              People

              • Assignee: Mostafa Mokhtar (mmokhtar)
              • Reporter: Anuj Phadke (anujphadke)
              • Votes: 1
              • Watchers: 12
