IMPALA / IMPALA-3080

Slow performance on Kerberos cluster - Possible inefficient implementation of TSaslTransport::write


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: Impala 2.6.0
    • Component/s: Perf Investigation
    • Labels:
      None

      Description

      Test results for Kerberized and non-Kerberized clusters (CDH 5.4.9):
      Driver version: 2.5.31
      Dataset: 11 STRING columns and 50,000 rows (same as the one used for CDH 5.0 testing)
      Linux driver:

      Kerberized Impala | Non-Kerberized Impala (NOSASL) | Kerberized Hive | Non-Kerberized Hive (SASL-PLAIN)
      4.232 s           | 1.88 s                         | 1.6 s           | 1.413 s
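
As a quick way to read the timings above, the following sketch computes how much slower each engine ran with Kerberos than without it. The dictionary layout and the function name `kerberos_slowdown` are illustrative assumptions; only the four timings come from the test results.

```python
# Timings in seconds, copied from the test results above.
timings = {
    "impala": {"kerberized": 4.232, "plain": 1.88},
    "hive": {"kerberized": 1.6, "plain": 1.413},
}


def kerberos_slowdown(engine):
    """Return Kerberized time divided by non-Kerberized time for one engine."""
    t = timings[engine]
    return t["kerberized"] / t["plain"]


for engine in timings:
    print(f"{engine}: {kerberos_slowdown(engine):.2f}x")
# prints:
# impala: 2.25x
# hive: 1.13x
```

On these numbers, Kerberos roughly doubles the Impala run time while adding about 13% to the Hive run time.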

      Impala sent 7,694,354 bytes, versus 5,770,412 bytes sent by Hive. A closer look at the Wireshark/tcpdump trace shows that the majority of the TCP packets from Impala carry only 1, 2, or 4 bytes of data, whereas the majority of the TCP packets from Hive carry 1460 bytes of data. Impala sends the entire dataset in 33,126 TCP packets, whereas Hive uses 4,882 TCP packets. That is an average of roughly 232 bytes per packet for Impala against roughly 1,182 for Hive.
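
The packet pattern described above is what one would expect if every small write were pushed to the socket on its own. The sketch below illustrates that general effect and the usual remedy of coalescing writes into a buffer before sending. It is not the actual `TSaslTransport::write` code: `CountingSink`, `write_unbuffered`, `write_coalesced`, and the 4-byte length-prefix framing are hypothetical names and assumptions made for illustration, and the 1460-byte threshold is simply the payload size seen in the Hive trace.

```python
import struct


class CountingSink:
    """Hypothetical stand-in for a socket: records every send() call."""

    def __init__(self):
        self.sends = []

    def send(self, data):
        self.sends.append(bytes(data))
        return len(data)


def write_unbuffered(sink, payloads):
    # Each 4-byte length prefix and each payload goes out as its own send().
    for p in payloads:
        sink.send(struct.pack(">I", len(p)))
        sink.send(p)


def write_coalesced(sink, payloads, limit=1460):
    # Accumulate framed payloads; send once the buffer reaches `limit` bytes.
    buf = bytearray()
    for p in payloads:
        buf += struct.pack(">I", len(p)) + p
        if len(buf) >= limit:
            sink.send(buf)
            buf = bytearray()
    if buf:
        sink.send(buf)  # flush whatever is left


payloads = [b"ab"] * 1000
unbuffered, coalesced = CountingSink(), CountingSink()
write_unbuffered(unbuffered, payloads)
write_coalesced(coalesced, payloads)
print(len(unbuffered.sends), len(coalesced.sends))  # prints "2000 5"
```

Both variants put the same bytes on the wire; only the number of send calls differs. The number of send calls is not the same thing as the number of TCP packets, since the kernel may merge or split writes, so this is only a rough model of the trace.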

            People

            • Assignee:
              Mostafa Mokhtar (mmokhtar)
            • Reporter:
              Anuj Phadke (anujphadke)
