Test results for Kerberized and non-Kerberized clusters (CDH 5.4.9):
Driver version: 2.5.31
Dataset: 11 STRING columns and 50000 rows (the same one used for CDH 5.0 testing)
||Kerberized Impala||Non-Kerberized Impala (NOSASL)||Kerberized Hive||Non-Kerberized Hive (SASL-PLAIN)||
|4.232 s|1.88 s|1.6 s|1.413 s|
Impala sent 7694354 bytes versus 5770412 bytes sent from Hive. A closer look at the Wireshark/tcpdump trace shows that the majority of the TCP packets from Impala contain 1, 2, or 4 bytes of data, whereas the majority of the TCP packets from Hive contain 1460 bytes of data. Impala sends the entire dataset using 33126 TCP packets, whereas Hive uses 4882 TCP packets.
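
To put the packet counts in perspective, the figures above can be turned into an average number of bytes per packet. The sketch below is only illustrative arithmetic on the reported totals. It assumes the byte totals and packet counts cover the same set of packets in each trace, and it does not know whether the byte totals include protocol headers. The variable and function names are made up for this example.

```python
# Figures copied from the trace summary above.
IMPALA_BYTES, IMPALA_PACKETS = 7694354, 33126
HIVE_BYTES, HIVE_PACKETS = 5770412, 4882


def avg_bytes_per_packet(total_bytes, packets):
    """Mean bytes per TCP packet (assumes both totals describe the same packets)."""
    return total_bytes / packets


impala_avg = avg_bytes_per_packet(IMPALA_BYTES, IMPALA_PACKETS)
hive_avg = avg_bytes_per_packet(HIVE_BYTES, HIVE_PACKETS)

# How many times more packets and bytes Impala needed for the same dataset.
packet_ratio = IMPALA_PACKETS / HIVE_PACKETS
byte_ratio = IMPALA_BYTES / HIVE_BYTES

print(f"Impala: {impala_avg:.1f} bytes/packet on average")
print(f"Hive:   {hive_avg:.1f} bytes/packet on average")
print(f"Impala/Hive packet ratio: {packet_ratio:.2f}")
print(f"Impala/Hive byte ratio:   {byte_ratio:.2f}")
```

Under those assumptions, Impala averages roughly 232 bytes per packet against roughly 1182 for Hive, and uses close to 7 times as many packets to move about 1.33 times as many bytes.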