Details
-
Improvement
-
Status: Patch Available
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Two major optimizations can be done:
PIG-1472added multiple data types to store different sizes (byte, short, int). It can be simplified using WritableUtils.writeVInt. There is no difference for byte and short compared to current approach. But with int, it could be beneficial where lot of numbers could be written with 3 bytes instead of 4. For eg: 32768 is written using 3 bytes in with WritableUtils.writeVInt whereas currently 4 bytes (int) is used.- String comparison in BinInterSedesTupleRawComparator initializes String for comparison. Should instead compare bytes like Text.Comparator.
str1 = new String(bb1.array(), bb1.position(), casz1, BinInterSedes.UTF8); str2 = new String(bb2.array(), bb2.position(), casz2, BinInterSedes.UTF8);