Hive / HIVE-1262

Add security/checksum UDFs sha, crc32, md5, aes_encrypt, and aes_decrypt

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: UDF
    • Labels:
      None

      Description

      Add security/checksum UDFs sha, crc32, md5, aes_encrypt, and aes_decrypt

      1. hive-1262-1.patch.txt
        37 kB
        Edward Capriolo

          Activity

          jvs John Sichi added a comment -

          Review comments added in

          https://reviews.apache.org/r/192/

          ghoward@apache.org Geoff Howard added a comment -

          There is a bug in the evaluate method of GenericUDFSha. In the for loop that converts the hashed bytes back to their string representation, the use of Integer.toHexString(0xFF & digested[i]) drops the leading zero for any byte value less than 0x10. You can see this in the udf_sha.q.out file in the patch. The correct SHA-1 hash of "hive rules!" is:
          e0b2715219b30234f0aef56786f81046a366699f but the output of this function is:
          e0b2715219b3234f0aef56786f81046a366699f

          The seventh byte is 0x02, but is output as string "2".

          The typical fix is to force the pad with code as follows:

          Integer.toString((0xFF & digested[i]) + 0x100, 16).substring(1)

          but that creates an extra String object and I prefer the following:

          int j = 0xFF & digested[i];
          if (j < 0x10) hexString.append('0');
          hexString.append(Integer.toHexString(j));

          I can upload a new patch but don't currently have the source code checked out, so I'm hoping someone beats me to it...
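
The padding problem described in this comment can be reproduced outside Hive. The following is a minimal sketch, not the code from the patch: the class name HexPaddingDemo and its helper methods are hypothetical, and UTF-8 is assumed for the input encoding. It runs both the unpadded conversion and the proposed fix over the same digest so the two strings can be compared.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; it is not part of the Hive patch.
class HexPaddingDemo {

    // Reproduces the reported bug: toHexString emits a single digit
    // for byte values below 0x10, so the leading zero is lost.
    static String toHexUnpadded(byte[] digested) {
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < digested.length; i++) {
            hexString.append(Integer.toHexString(0xFF & digested[i]));
        }
        return hexString.toString();
    }

    // The fix proposed in the comment: append an explicit '0'
    // before any value that would print as a single hex digit.
    static String toHexPadded(byte[] digested) {
        StringBuilder hexString = new StringBuilder(digested.length * 2);
        for (int i = 0; i < digested.length; i++) {
            int j = 0xFF & digested[i];
            if (j < 0x10) {
                hexString.append('0');
            }
            hexString.append(Integer.toHexString(j));
        }
        return hexString.toString();
    }

    // Computes a SHA-1 digest; UTF-8 input encoding is an assumption.
    static byte[] sha1(String text) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        return digest.digest(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] digested = sha1("hive rules!");
        System.out.println("unpadded: " + toHexUnpadded(digested));
        System.out.println("padded:   " + toHexPadded(digested));
    }
}
```

A SHA-1 digest is always 20 bytes, so the padded string should always be 40 characters long, while the unpadded one loses one character for every byte below 0x10.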

          appodictic Edward Capriolo added a comment -

          @Geoff
          This is held back on my end. John added some review board comments to my latest patch and I have been occupied by other things. Since you are interested in using this I will finalize the patch today and it should get committed to trunk soon.

          ghoward@apache.org Geoff Howard added a comment -

          That's great - thanks. I forgot to mention that the same bug will affect the MD5 implementation as well, with the same trivial fix obviously.

          appodictic Edward Capriolo added a comment -

          @Sichi We cannot move the following block:

              try {
                digest = java.security.MessageDigest.getInstance("MD5");
              } catch (NoSuchAlgorithmException e) {
                LOG.error(e);
                // throw new HiveException(e);
              }

          because initialize cannot throw a HiveException. I am not sure what you want me to do.

          jvs John Sichi added a comment -

          Throwing RuntimeException instead should be fine.
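
One way to read this suggestion is sketched below. It is an illustration under assumptions, not the committed code: the class DigestInitSketch, its initialize(String) signature, and the getDigest accessor are hypothetical stand-ins, since the thread does not show the UDF's real method signature. The checked NoSuchAlgorithmException is wrapped in an unchecked RuntimeException, so the method needs no throws clause and the failure is not silently logged.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for the UDF class discussed in the thread.
class DigestInitSketch {
    private MessageDigest digest;

    // Stand-in for the UDF's initialize method; the real signature
    // is not shown in the thread, so this one is an assumption.
    void initialize(String algorithm) {
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            // Fail fast with an unchecked exception instead of logging
            // and continuing with a null digest.
            throw new RuntimeException("Digest algorithm not available: " + algorithm, e);
        }
    }

    MessageDigest getDigest() {
        return digest;
    }

    public static void main(String[] args) {
        DigestInitSketch udf = new DigestInitSketch();
        udf.initialize("MD5");
        System.out.println(udf.getDigest().getAlgorithm());
    }
}
```

Keeping the original exception as the cause preserves the stack trace for whoever has to diagnose a missing algorithm provider.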


            People

            • Assignee:
              appodictic Edward Capriolo
              Reporter:
              appodictic Edward Capriolo
            • Votes:
              4
              Watchers:
              6
