Hive / HIVE-3246

java primitive type for binary datatype should be byte[]

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      This fix changes the internal representation of the binary type within Hive. UDFs that use the binary type and rely on the Java representation of binary data in Hive being ByteArrayRef need to be updated, because that representation is now byte[]. Note that this does not change the format of on-disk data.
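As a rough illustration of the UDF-side migration the release note describes, the sketch below shows a helper that accepts a binary value in either representation. LegacyByteArrayRef is a hypothetical stand-in for Hive's ByteArrayRef (assumed here to be a simple holder with getData/setData), and BinaryArgs.toBytes is an invented helper name; neither is taken from the patch.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the pre-0.10.0 wrapper; NOT Hive's actual class. */
final class LegacyByteArrayRef {
    private byte[] data;

    void setData(byte[] data) { this.data = data; }

    byte[] getData() { return data; }
}

/** Hypothetical helper a UDF could use while it is being ported. */
final class BinaryArgs {
    private BinaryArgs() {}

    /**
     * Returns the raw bytes of a binary value, whether it arrives as a plain
     * byte[] (the representation after this fix) or inside the old wrapper.
     */
    static byte[] toBytes(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        if (value instanceof LegacyByteArrayRef) {
            return ((LegacyByteArrayRef) value).getData();
        }
        throw new IllegalArgumentException(
            "Unsupported binary representation: " + value.getClass().getName());
    }
}

final class BinaryArgsDemo {
    public static void main(String[] args) {
        byte[] payload = {0x01, 0x02, 0x03};

        LegacyByteArrayRef oldStyle = new LegacyByteArrayRef();
        oldStyle.setData(payload);

        // Both calls yield the same underlying array.
        System.out.println(Arrays.toString(BinaryArgs.toBytes(oldStyle)));
        System.out.println(Arrays.toString(BinaryArgs.toBytes(payload)));
    }
}
```

A UDF written only against 0.10.0 or later would not need the wrapper branch at all; it is shown only to make the difference between the two representations concrete.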

      Description

      PrimitiveObjectInspector.getPrimitiveJavaObject is supposed to return a Java object. For the binary datatype, however, it returns ByteArrayRef, which is not a standard Java type. The suitable Java object for it would be byte[].
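To make the point about standard Java types concrete, here is a minimal sketch of a caller that consumes the result of a getPrimitiveJavaObject-style call using only JDK classes. PrimitiveInspectorSketch, JavaBinaryInspectorSketch, and HexDump are hypothetical, trimmed-down names invented for this example; they only mimic the shape of the contract and are not Hive's real interfaces.

```java
/** Hypothetical, trimmed-down stand-in for a primitive object inspector. */
interface PrimitiveInspectorSketch {
    Object getPrimitiveJavaObject(Object o);
}

/** Sketch of a binary inspector whose Java-side value is a plain byte[]. */
final class JavaBinaryInspectorSketch implements PrimitiveInspectorSketch {
    @Override
    public byte[] getPrimitiveJavaObject(Object o) {
        // Assumption: the stored field object is already a byte[]; null passes through.
        return (byte[]) o;
    }
}

/** Caller code that needs no wrapper type to work with the returned value. */
final class HexDump {
    private HexDump() {}

    static String toHex(PrimitiveInspectorSketch inspector, Object field) {
        Object value = inspector.getPrimitiveJavaObject(field);
        if (value == null) {
            return null;
        }
        byte[] bytes = (byte[]) value;
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}

final class HexDumpDemo {
    public static void main(String[] args) {
        byte[] field = {0x0a, (byte) 0xff, 0x00};
        System.out.println(HexDump.toHex(new JavaBinaryInspectorSketch(), field));
    }
}
```

With a wrapper type as the return value, the same caller would first have to know about and unwrap that type before it could use any standard byte[] API.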

      1. HIVE-3246.1.patch
        28 kB
        Thejas M Nair
      2. HIVE-3246.2.patch
        28 kB
        Thejas M Nair

          Activity

          Ashutosh Chauhan added a comment -

          This issue is fixed and was released as part of the 0.10.0 release. If you find an issue that seems to be related to this one, please create a new JIRA and link it to this one.

          Hudson added a comment -

          Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/)
          HIVE-3246 : java primitive type for binary datatype should be byte[] (Thejas Nair via Ashutosh Chauhan) (Revision 1363427)

          Result = ABORTED
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363427
          Files :

          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/BinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/SettableBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantBinaryObjectInspector.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/MyTestClass.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/TestBinarySortableSerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/MyTestClassBigger.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestStandardObjectInspectors.java
          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #1553 (See https://builds.apache.org/job/Hive-trunk-h0.21/1553/)
          HIVE-3246 : java primitive type for binary datatype should be byte[] (Thejas Nair via Ashutosh Chauhan) (Revision 1363427)

          Result = FAILURE
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363427
          Files :

          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/BinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/SettableBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableBinaryObjectInspector.java
          • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantBinaryObjectInspector.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/MyTestClass.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/TestBinarySortableSerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/MyTestClassBigger.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java
          • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestStandardObjectInspectors.java
          Ashutosh Chauhan added a comment -

          Committed to trunk. Thanks, Thejas!

          Ashutosh Chauhan added a comment -

          Yeah. I will add the release notes while resolving this ticket.

          Edward Capriolo added a comment -

          We have to do a release note; people who write binary UDFs are going to be affected by the change.

          https://github.com/edwardcapriolo/hive_cassandra_udfs

          It seems that if you wrote a UDF that returns binary, you may be affected, depending on how you wrote it.

          Ashutosh Chauhan added a comment -

          +1. Looks good. Running tests.

          Travis Crawford added a comment -

          I patched v2 of this patch into a clean trunk and was able to run the query that failed in HIVE-3266. It was a simple "select *" from a table using ThriftDeserializer that has a binary field.

          Thejas M Nair added a comment -

          Correct RB link - https://reviews.apache.org/r/5943/

          Namit Jain added a comment -

          See Ashutosh's comments above

          Ashutosh Chauhan added a comment -

          Thejas M Nair Looks like the RB link is incorrect. Can you take a look?

          Thejas M Nair added a comment -

          Patch is ready for review

          Thejas M Nair added a comment -

          HIVE-3246.2.patch - restores a comment that was accidentally reverted by the previous patch.
          Reviewboard link - https://reviews.apache.org/r/5414/

          Thejas M Nair added a comment -

          "byte[] is not a primitive. I have no issue with the change but there must be a reason this was done. Likely a performance issue."

          Yes, byte[] is not a Java primitive type, but it is as close as it gets (it is a basic type that ships with Java). ByteArrayRef is just a wrapper around byte[], and I don't see any performance advantage in using it when binary is converted to a Java primitive type via the getPrimitiveJavaObject call. FYI, Ashutosh implemented the binary type using ByteArrayRef in HIVE-2380, and I believe he does not see a performance issue either (based on his comment above).

          Edward Capriolo added a comment -

          byte[] is not a primitive. I have no issue with the change, but there must be a reason this was done. Likely a performance issue.

          Ashutosh Chauhan added a comment -

          This makes sense. Primitive type should really be primitive. : )

          Thejas M Nair added a comment -

          The fix to use byte[] will not be backward compatible, but for the long term I think it is important to fix this. Hopefully, this won't affect too many people as binary datatype is relatively new.


            People

            • Assignee: Thejas M Nair
            • Reporter: Thejas M Nair
            • Votes: 0
            • Watchers: 6
