There is a bug in the evaluate method of GenericUDFSha. In the for loop that converts the digest bytes back to their string representation, Integer.toHexString(0xFF & digested[i]) drops the leading zero for any byte value less than 0x10. You can see this in the udf_sha.q.out file in the patch. The correct SHA-1 hash of "hive rules!" is:
e0b2715219b30234f0aef56786f81046a366699f but the output of this function is:
The seventh byte is 0x02, but it is output as the string "2" instead of "02".
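To make the failure easy to reproduce outside Hive, here is a minimal sketch of the same conversion loop. The class name UnpaddedHexDemo, the method name unpaddedHex, and the three-byte sample (the sixth through eighth bytes of the hash above) are made up for illustration; this is not the actual GenericUDFSha code.

```java
// Hypothetical stand-alone reproduction; not taken from the patch.
class UnpaddedHexDemo {

    // Mirrors the loop described above: each byte is masked to 0..255 and
    // passed to Integer.toHexString, which never emits a leading zero.
    static String unpaddedHex(byte[] digested) {
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < digested.length; i++) {
            hexString.append(Integer.toHexString(0xFF & digested[i]));
        }
        return hexString.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xb3, 0x02, 0x34};
        // Three bytes should give six hex digits, but only five come out.
        System.out.println(unpaddedHex(sample)); // prints "b3234"
    }
}
```

Every byte below 0x10 shortens the result by one character, so the output length varies with the input instead of always being 40 characters for SHA-1.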
The typical fix is to force the zero padding as follows:
Integer.toString((0xFF & digested[i]) + 0x100, 16).substring(1)
but that creates an extra String object and I prefer the following:
int j = 0xFF & digested[i];
if (j < 0x10) hexString.append('0');
hexString.append(Integer.toHexString(j));
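For comparison, a self-contained sketch of both padding approaches side by side. The class and method names (PaddedHexDemo, padWithOffset, padWithCheck) and the sample bytes are hypothetical, and hexString is assumed to be a StringBuilder; the real evaluate method may differ.

```java
// Hypothetical comparison of the two fixes; names are illustrative only.
class PaddedHexDemo {

    // "Typical" fix: adding 0x100 forces a three-digit hex string ("1xx"),
    // and substring(1) drops the leading '1', leaving exactly two digits.
    static String padWithOffset(byte[] digested) {
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < digested.length; i++) {
            hexString.append(Integer.toString((0xFF & digested[i]) + 0x100, 16).substring(1));
        }
        return hexString.toString();
    }

    // Preferred fix: append an explicit '0' only when the value fits in one digit.
    static String padWithCheck(byte[] digested) {
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < digested.length; i++) {
            int j = 0xFF & digested[i];
            if (j < 0x10) hexString.append('0');
            hexString.append(Integer.toHexString(j));
        }
        return hexString.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xb3, 0x02, 0x34};
        System.out.println(padWithOffset(sample)); // prints "b30234"
        System.out.println(padWithCheck(sample));  // prints "b30234"
    }
}
```

Both variants always emit two lowercase hex digits per byte; the second avoids the intermediate substring for each byte.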
I can upload a new patch but don't currently have the source code checked out, so I'm hoping someone beats me to it...