Hive / HIVE-956

Add support of columnar binary serde

    Details

    1. HIVE.956.patch.0
      71 kB
      Krishna Kumar
    2. HIVE.956.patch.1
      72 kB
      Krishna Kumar
    3. HIVE.956.patch.2
      73 kB
      Krishna Kumar
    4. HIVE-956v3.patch
      65 kB
      Krishna Kumar
    5. HIVE-956v4.patch
      74 kB
      Krishna Kumar

      Activity

      Krishna Kumar added a comment -

      Initial patch for lazy binary columnar serde. Reuses elements of columnar serde and lazy binary serde.

      Krishna Kumar added a comment -

      Initial patch for lazy binary columnar serde. Reuses elements of columnar serde and lazy binary serde.

      I used a 0-length value as the null indicator for all types (columnar uses a special sequence; lazybinary uses one byte for every 8 fields). But this meant that, to distinguish empty strings from nulls, I need to encode the string length as part of the serialized data, which adds a cost to every string. Because of this, the results of this serialization were not impressive on the specific dataset I used for testing. Thoughts?

      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/806/
      -----------------------------------------------------------

      Review request for hive and Yongqiang He.

      Summary
      -------

      Add LazyBinaryColumnarSerDe

      This addresses bug HIVE-956.
      https://issues.apache.org/jira/browse/HIVE-956

      Diffs


      ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java b062460
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java e927547
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 2e2896c
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java 1440472
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java ea20b34
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 5e6bb0a
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ColumnarStructObjectInspector.java 66f4f8d
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/CrossMapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/MapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java 90561a1
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java 2b77072
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SimpleMapEqualComparer.java PRE-CREATION
      serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java PRE-CREATION

      Diff: https://reviews.apache.org/r/806/diff

      Testing
      -------

      Tests added

      Thanks,

      Krishna

      Krishna Kumar added a comment -

      Review board: https://reviews.apache.org/r/806/
      Krishna Kumar added a comment -

      Patch with a change to how null vs. empty strings are handled. This patch uses a special single-byte marker (which is not a valid UTF-8 byte sequence) to mark empty strings. With this patch, this serde is better than the default serde

      • for all primitive types
      • also when null values preponderate

      while performing about the same for complex structures such as lists of structs. Please review.
      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/806/
      -----------------------------------------------------------

      (Updated 2011-06-02 12:00:23.653491)

      Review request for hive and Yongqiang He.

      Changes
      -------

      Uses a special marker for empty strings, thereby incurring no additional cost for normal (non-null, non-empty) strings.

      Summary
      -------

      Add LazyBinaryColumnarSerDe

      This addresses bug HIVE-956.
      https://issues.apache.org/jira/browse/HIVE-956

      Diffs (updated)


      ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java b062460
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java e927547
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 2e2896c
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java 1440472
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java ea20b34
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 5e6bb0a
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ColumnarStructObjectInspector.java 66f4f8d
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/CrossMapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/MapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java 90561a1
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java 2b77072
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SimpleMapEqualComparer.java PRE-CREATION
      serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java PRE-CREATION

      Diff: https://reviews.apache.org/r/806/diff

      Testing
      -------

      Tests added

      Thanks,

      Krishna

      He Yongqiang added a comment -

      will take a look.

      He Yongqiang added a comment -

      I got one conflict when trying to apply this patch locally. Can you rebase?

      A few minor comments:
      1. Is getLength() only for the null check? If yes, can you call it 'isNull()'? And if the only difference between ColumnarStruct and BinaryColumnarStruct is the null check, just curious, how difficult is it to avoid this new BinaryColumnarStruct class?
      2. Can you add a 'toString()' for the new binary columnar serde, the same as in the columnar serde?
      3. Why do you need to handle the empty string specially? "serializeStream.write(INVALID_UTF__SINGLE_BYTE, 0, 1);" I thought that for empty data we just store a data length of 0 in rcfile.
      I thought that, since there is no NULLSequence in the new serde, it is the null that should be handled specially. I am missing something: how do you handle null here?
      4. Can LazyBinaryColumnarSerDe share some code with LazyBinarySerde?
      5. How is warnedOnceNullMapKey used?
      Originally, map comparison was not supported, but this patch adds a mapEqualComparer. Can we put this in a separate patch? It seems the logic in CrossMapEqualComparer is not correct. (How do you make sure you get the keys from a map in the same order?)

      Thanks for this great work! Please resubmit the patch after rebasing.

      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/806/
      -----------------------------------------------------------

      (Updated 2011-06-08 16:04:08.811137)

      Review request for hive and Yongqiang He.

      Changes
      -------

      Updating review comments re toString()

      Summary
      -------

      Add LazyBinaryColumnarSerDe

      This addresses bug HIVE-956.
      https://issues.apache.org/jira/browse/HIVE-956

      Diffs (updated)


      ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java e79021d
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java e927547
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 2e2896c
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java 1440472
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java ea20b34
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 4285ab3
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ColumnarStructObjectInspector.java 66f4f8d
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/CrossMapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/MapEqualComparer.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java 90561a1
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java 2b77072
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SimpleMapEqualComparer.java PRE-CREATION
      serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java PRE-CREATION

      Diff: https://reviews.apache.org/r/806/diff

      Testing
      -------

      Tests added

      Thanks,

      Krishna

      Krishna Kumar added a comment -

      Re-ordering my responses:

      can LazyBinaryColumnarSerDe share some code with LazyBinarySerde?

      Yes, it does. For instance, serialization in LazyBinaryColumnarSerde forwards to LazyBinarySerde.serialize for each field (except for the one special case of empty strings described below). Similarly, deserialization in both cases eventually happens via LazyBinaryXXX.init. The objectinspector is the same, while the common parts of the object classes (ColumnarStruct and BinaryColumnarStruct) have been refactored into a ColumnarStructBase class.

      how warnedOnceNullMapKey is used?

      To enable the above (reuse of LazyBinarySerde.serialize()), I have made it a static method of LazyBinarySerde. The only member variable used in that method was a boolean that controls issuing a LOG warning the first time a null map key is encountered. So I made it a parameter, so that the existing behaviour of LazyBinarySerde is unchanged (that is, a warning is issued once per instance).
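The warn-once flag passed as a parameter can be sketched roughly as below. This is an illustration only, not the code from the patch: the class `WarnOnceSketch`, the method `serializeKeys`, the list used in place of a logger, and the choice to return the updated flag to the caller are all assumptions made for the example.

```java
import java.util.List;

final class WarnOnceSketch {
    /**
     * Static, so several serdes can share it. The caller owns the
     * "already warned" flag, passes it in, and stores the returned value,
     * which keeps the warning at once per calling instance.
     * The log parameter stands in for a real logger (assumption).
     */
    static boolean serializeKeys(List<String> keys, StringBuilder out,
                                 List<String> log, boolean warnedOnceNullMapKey) {
        for (String key : keys) {
            if (key == null && !warnedOnceNullMapKey) {
                log.add("Null map key encountered; further warnings suppressed");
                warnedOnceNullMapKey = true;
            }
            // Null keys are still written (as the text "null" in this toy format).
            out.append(key).append(';');
        }
        return warnedOnceNullMapKey;
    }
}
```

A caller would keep a boolean field and update it on every call, for example `warned = WarnOnceSketch.serializeKeys(keys, out, log, warned);`.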

      why do you need to handle empty string specially? "serializeStream.write(INVALID_UTF__SINGLE_BYTE, 0, 1);" i thought for empty data, we just store data length 0 in rcfile.

      I thought since there is no NULLSequence in the new serde, the null should be handled specially. i am missing sth, How do you handle null here?

      As you know, the lengths of column values are stored in the key part of rcfile (after run-length encoding and optional compression). A 0 in this recorded length is used as the null indicator. This means that non-null values must occupy one or more bytes when serialized. That was fine with the original LazyBinarySerde.serialize, as primitive numeric types, strings (with their data length) and complex types (with their data length) all occupy a non-zero number of bytes. But this is too much redundancy and overhead for the typical case (non-empty strings), so I added an extra parameter "skipLengthPrefix", which skips the length prefix for string/list/map/struct types. With this, however, empty strings become a problem, since they need to be differentiated from nulls. So I used this special single-byte marker to denote empty strings. (As a side note, for completeness' sake, I should point out that an instance of a struct which has no fields will be encoded with zero bytes. But this is not allowed by the language, so I think we are fine here.)
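The trade-off described above can be illustrated with a small sketch for string values only. It is not the patch's code: the class and method names are hypothetical, the marker value 0xFF is an assumption chosen because that byte never occurs in valid UTF-8, and a fixed 4-byte length prefix stands in for whatever prefix encoding the real serde uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class StringEncodingSketch {
    // Assumption: any single byte that is not valid UTF-8 would do.
    static final byte EMPTY_MARKER = (byte) 0xFF;

    /** Returns the bytes stored for one string column value. */
    static byte[] serialize(String value, boolean skipLengthPrefix) {
        if (value == null) {
            return new byte[0]; // recorded length 0 means null
        }
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (!skipLengthPrefix) {
            // Prefix makes every non-null value non-empty, at a cost per string.
            return ByteBuffer.allocate(4 + utf8.length)
                    .putInt(utf8.length).put(utf8).array();
        }
        if (utf8.length == 0) {
            return new byte[] {EMPTY_MARKER}; // "" must not collide with null
        }
        return utf8; // typical case: no overhead at all
    }
}
```

With the prefix skipped, only empty strings pay an extra byte; ordinary strings are stored as their UTF-8 bytes and nothing else.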

      is getLength() only for null check? if yes, can you call it 'isNull()'? And if the only difference in ColumnarStruct and BinaryColumnarStruct is null check, just curious, how difficult is it to avoid this new BinaryColumnarStruct class?

      See above. In general, the length recorded in the key part of rcfile reflects the length of the byte sequence with which the lazyobject should be initialized. The only exception is the case of empty strings, where the recorded length is 1 (the special marker) but the lazyobject needs to be initialized with a 0-length byte sequence.

      A recorded length of 0 indicates null for lazybinarycolumnar, while data equal to the nullsequence indicates null for columnar. The difference between ColumnarStruct and BinaryColumnarStruct is this length/null handling and the object creation itself, which are now the abstract methods of the common base class.
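A rough sketch of how the two null conventions can sit behind one abstract method follows. All names are hypothetical and the classes do not use Hive's types; the marker value 0xFF and the convention of returning -1 for null are assumptions made for the example. Note that the marker check is applied only to string columns, since a one-byte numeric value could legitimately hold the same byte.

```java
import java.util.Arrays;

abstract class ColumnFieldSketch {
    /** Length to initialize the lazy object with, or -1 if the field is null. */
    abstract int getLength(byte[] data, int start, int recordedLength);
}

/** Text-style columnar convention: null is a special byte sequence. */
final class TextFieldSketch extends ColumnFieldSketch {
    private final byte[] nullSequence;

    TextFieldSketch(byte[] nullSequence) {
        this.nullSequence = nullSequence;
    }

    @Override
    int getLength(byte[] data, int start, int recordedLength) {
        boolean isNull = recordedLength == nullSequence.length
                && Arrays.equals(
                        Arrays.copyOfRange(data, start, start + recordedLength),
                        nullSequence);
        return isNull ? -1 : recordedLength;
    }
}

/** Binary columnar convention: recorded length 0 is null. */
final class BinaryFieldSketch extends ColumnFieldSketch {
    static final byte EMPTY_MARKER = (byte) 0xFF; // assumption, not valid UTF-8
    private final boolean stringColumn;

    BinaryFieldSketch(boolean stringColumn) {
        this.stringColumn = stringColumn;
    }

    @Override
    int getLength(byte[] data, int start, int recordedLength) {
        if (recordedLength == 0) {
            return -1; // null
        }
        if (stringColumn && recordedLength == 1 && data[start] == EMPTY_MARKER) {
            return 0; // empty string: initialize with a 0-length sequence
        }
        return recordedLength;
    }
}
```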

      originally the map comparison is not supported, but this patch added a mapEqualComparer. can we put this in a separate patch? It seems the logic in CrossMapEqualComparer is not correct. (how do you make sure you will get the keys from a map in some kind of same order?)

      I put this in this patch since I needed it for the tests that I had added. Do you think I should create a dependent jira and extract this part of the patch to that jira?

      Hmm, the logic in crossmapequalcomparer looks ok to me (given the caveats mentioned in the javadoc about broken transitivity of greater-than/less-than). I am not accessing the keys in tandem, but in a nested loop. Since the number of keys is the same and the keys are unique, a key and value that both match (as declared by ObjectInspectorUtils.compare) are taken as a match for that pair of key-value pairs.
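The nested-loop idea can be sketched as follows, reduced to a plain equality check. This is not CrossMapEqualComparer itself: the class and method names are hypothetical, the result is a boolean rather than a compare-style int, and a caller-supplied `BiPredicate` stands in for ObjectInspectorUtils.compare.

```java
import java.util.Map;
import java.util.function.BiPredicate;

final class CrossMapCompareSketch {
    /**
     * Order-independent comparison: every entry of a must find an entry of b
     * whose key and value both match. With equal sizes and unique keys, this
     * is enough to declare the maps equal. Cost is O(n * m) comparisons.
     */
    static boolean mapsEqual(Map<?, ?> a, Map<?, ?> b,
                             BiPredicate<Object, Object> same) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<?, ?> ea : a.entrySet()) {
            boolean matched = false;
            for (Map.Entry<?, ?> eb : b.entrySet()) {
                if (same.test(ea.getKey(), eb.getKey())
                        && same.test(ea.getValue(), eb.getValue())) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                return false;
            }
        }
        return true;
    }
}
```

Because the inner loop scans all of the second map's entries, the iteration order of either map does not affect the result, which is the point raised in the review question.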

      can you add a 'toString()' for new binary columnar serde, just the same as columnar serde

      Done, and patch regenerated after rebasing.

      Show
      Krishna Kumar added a comment - Re-ordering my responses; can LazyBinaryColumnarSerDe share some code with LazyBinarySerde? Yes, it does. For instance, the serialization of LazyBinaryColumnarSerde forwards to LazyBinarySerde.serialize for each field (except for one special case of empty string described below). Similarly the deserialization happens in both cases eventually via LazyBinaryXXX.init. The objectinspector is the same, while the common parts of the object class (ColumnarStruct and BinaryColumnarStruct) has been refactored into a ColumnarStructBase class. how warnedOnceNullMapKey is used? To enable the above (reuse of LazyBinarySerde.serialize()), I have made it a static method of LazyBinarySerde. The only member variable which was used in that method was a boolean which was used to issue a LOG warning the first time a null map key is encountered. So, I made that into a parameter so that the existing behaviour for LazyBinarySerde is unchanged (that is a warning is issued once per instance). why do you need to handle empty string specially? "serializeStream.write(INVALID_UTF__SINGLE_BYTE, 0, 1);" i thought for empty data, we just store data length 0 in rcfile. I thought since there is no NULLSequence in the new serde, the null should be handled specially. i am missing sth, How do you handle null here? As you know, column values' length are stored in the key part of rcfile (after run-length encoding, and an optional compressing). A 0 in this recorded length is used as the null indicator. This means that non-null values should occupy one or more bytes when serialized. That was ok with the original LazyBinarySerde.serialize, as primitive numeric types, strings (with their datalength) and complex types (with their datalength) do occupy non-zero bytes. 
But this is too much redundancy and overhead for the typical case (non-empty strings), so I added an extra parameter "skipLengthPrefix" which skips prefixing string/list/map/struct types with a length prefix. But with this, empty strings become a problem since they need to differentiated from nulls. So I used this special single-byte marker for denoting empty strings. (As a side note, for completeness' sake, I should point out that an instance of a struct which has no fields will be encoded with zero bytes. But this is not allowed by the language so I think we are fine here.) is getLength() only for null check? if yes, can you call it 'isNull()'? And if the only difference in ColumnarStruct and BinaryColumnarStruct is null check, just curious, how difficult is it to avoid this new BinaryColumnarStruct class? See above. In general, the length recorded in the key part of rcfile reflects the length of the bytesequence with which the lazyobject should be initialized. The only exception is in the case of empty strings, where the recorded length is 1 (the special marker), but the lazyobject needs be initialized with a 0-length byte sequence. Recorded length being 0 indicates nulls for lazybinarycolumnar and data being the nullsequence indicates null for lazybinary. The difference between ColumnarStruct and BinaryColumnarStruct is this length/null handling, and the object creation itself, which are now the abstract methods of the common base class. originally the map comparison is not supported, but this patch added a mapEqualComparer. can we put this in a separate patch? It seems the logic in CrossMapEqualComparer is not correct. (how do you make sure you will get the keys from a map in some kind of same order?) I put this in this patch since I needed that for the tests that I had added. Do you think I should create a dependant jira and extract this part of the patch to that jira? 
Hmm, the logic in crossmapequalcomparer looks ok to me (given the caveats mentioned in the javadoc about broken transitivity of greater-than/less-than.) I am not accessing the keys in tandem, but in a nested loop. Since the number of keys are the same, and the keys are unique, both keys and values matching (as declared by ObjectInspectorUtils.compare) is taken as a match for that pair of key-value pairs. can you add a 'toString()' for new binary columnar serde, just the same as columnar serde Done, and patch regenerated after rebasing.
      Hide
      Krishna Kumar added a comment -

      Re-ordering my responses;

      can LazyBinaryColumnarSerDe share some code with LazyBinarySerde?

      Yes, it does. For instance, the serialization of LazyBinaryColumnarSerde forwards to LazyBinarySerde.serialize for each field (except for one special case of empty string described below). Similarly the deserialization happens in both cases eventually via LazyBinaryXXX.init. The objectinspector is the same, while the common parts of the object class (ColumnarStruct and BinaryColumnarStruct) has been refactored into a ColumnarStructBase class.

      how warnedOnceNullMapKey is used?

      To enable the above (reuse of LazyBinarySerde.serialize()), I have made it a static method of LazyBinarySerde. The only member variable which was used in that method was a boolean which was used to issue a LOG warning the first time a null map key is encountered. So, I made that into a parameter so that the existing behaviour for LazyBinarySerde is unchanged (that is a warning is issued once per instance).

      why do you need to handle empty string specially? "serializeStream.write(INVALID_UTF__SINGLE_BYTE, 0, 1);" i thought for empty data, we just store data length 0 in rcfile.

      I thought since there is no NULLSequence in the new serde, the null should be handled specially. i am missing sth, How do you handle null here?

      As you know, column values' length are stored in the key part of rcfile (after run-length encoding, and an optional compressing). A 0 in this recorded length is used as the null indicator. This means that non-null values should occupy one or more bytes when serialized. That was ok with the original LazyBinarySerde.serialize, as primitive numeric types, strings (with their datalength) and complex types (with their datalength) do occupy non-zero bytes. But this is too much redundancy and overhead for the typical case (non-empty strings), so I added an extra parameter "skipLengthPrefix" which skips prefixing string/list/map/struct types with a length prefix. But with this, empty strings become a problem since they need to differentiated from nulls. So I used this special single-byte marker for denoting empty strings. (As a side note, for completeness' sake, I should point out that an instance of a struct which has no fields will be encoded with zero bytes. But this is not allowed by the language so I think we are fine here.)

      is getLength() only for null check? if yes, can you call it 'isNull()'? And if the only difference in ColumnarStruct and BinaryColumnarStruct is null check, just curious, how difficult is it to avoid this new BinaryColumnarStruct class?

      See above. In general, the length recorded in the key part of rcfile reflects the length of the bytesequence with which the lazyobject should be initialized. The only exception is in the case of empty strings, where the recorded length is 1 (the special marker), but the lazyobject needs be initialized with a 0-length byte sequence.

      Recorded length being 0 indicates nulls for lazybinarycolumnar and data being the nullsequence indicates null for lazybinary. The difference between ColumnarStruct and BinaryColumnarStruct is this length/null handling, and the object creation itself, which are now the abstract methods of the common base class.

      originally the map comparison is not supported, but this patch added a mapEqualComparer. can we put this in a separate patch? It seems the logic in CrossMapEqualComparer is not correct. (how do you make sure you will get the keys from a map in some kind of same order?)

      I put this in this patch since I needed it for the tests that I had added. Do you think I should create a dependent jira and extract this part of the patch to that jira?

      Hmm, the logic in CrossMapEqualComparer looks OK to me (given the caveats mentioned in the javadoc about broken transitivity of greater-than/less-than). I am not accessing the keys in tandem, but in a nested loop. Since the number of keys is the same and the keys are unique, both keys and values matching (as declared by ObjectInspectorUtils.compare) is taken as a match for that pair of key-value pairs.
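
      A minimal sketch of the nested-loop idea, for illustration only. It is not the CrossMapEqualComparer from the patch: it works on plain java.util.Map instances, and the two Comparator parameters are hypothetical stand-ins for ObjectInspectorUtils.compare.

```java
import java.util.Comparator;
import java.util.Map;

final class CrossMapEqualSketch {
    /**
     * Order-independent equality: every entry of a must have an entry in b
     * whose key and value both compare equal. Cost is O(n * m) comparisons.
     */
    static <K, V> boolean mapsEqual(Map<K, V> a, Map<K, V> b,
                                    Comparator<? super K> keyCmp,
                                    Comparator<? super V> valueCmp) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<K, V> ea : a.entrySet()) {
            boolean matched = false;
            // Inner loop: do not assume the two maps iterate in the same order.
            for (Map.Entry<K, V> eb : b.entrySet()) {
                if (keyCmp.compare(ea.getKey(), eb.getKey()) == 0
                        && valueCmp.compare(ea.getValue(), eb.getValue()) == 0) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                return false;
            }
        }
        return true;
    }
}
```

      Because sizes are equal and keys are unique, finding a match for every entry of the first map is enough; no ordering of keys is needed.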

      Can you add a 'toString()' for the new binary columnar serde, the same as for the columnar serde?

      Done, and patch regenerated after rebasing.

      He Yongqiang added a comment -

      Maybe I can be more specific:

      LazyBinaryColumnarSerDe and LazyBinarySerDe share a lot of code when doing serialization. Can warnedOnceNullMapKey be removed?

      >>A 0 in this recorded length is used as the null indicator.
      A 0 should mean an empty string. '\N' means null in Hive. Can you take a look at how LazyBinarySerde handles null value, and do the same thing here?

      I would prefer to remove getLength(); it will make the code cleaner. In the implementation, it actually only checks for primitives, which can produce wrong results.

      >>Do you think I should create a dependant jira and extract this part of the patch to that jira?
      Yes, please do that.

      Krishna Kumar added a comment -

      can warnedOnceNullMapKey be removed?

      It is easy to remove warnedOnceNullMapKey

      • if it is ok to log a warning message every time a null map key is encountered
      • if it is ok to log a warning message only once per process execution (by making it a class static)

      The current behavior is to log a warning message once per instance of LazyBinarySerDe. If we want to retain that behavior, the flag has to be either a parameter or a more complicated mechanism such as a callback or a thread-local.
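
      To make the "parameter" option concrete, here is a rough sketch. The names WarnedFlag and checkMapKey are hypothetical and not taken from the patch; the point is only that a static method can keep once-per-instance behaviour if each serde instance owns the flag and passes it in.

```java
final class WarnOnceSketch {
    /** Hypothetical mutable flag; each serde instance would own one. */
    static final class WarnedFlag {
        boolean warned;
    }

    /**
     * Stand-in for the part of a static serialize method that sees a map key.
     * Returns true only on the call that actually emits the warning.
     */
    static boolean checkMapKey(Object key, WarnedFlag flag) {
        if (key == null && !flag.warned) {
            flag.warned = true;
            System.err.println(
                "WARN: null map key encountered; further warnings suppressed for this instance");
            return true;
        }
        return false;
    }
}
```

      A class-level static boolean would instead warn once per process, and dropping the flag entirely would warn on every null key, which are the two alternatives listed above.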

      A 0 should mean an empty string. '\N' means null in Hive. Can you take a look at how LazyBinarySerde handles null value, and do the same thing here.

      Not sure I understand. Isn't the serde free to encode null/empty values any way it sees fit? '\N' means null only in the context of a specific serde, for instance the columnar serde. LazyBinarySerDe uses a null byte for every 8 fields to encode nulls (and a string length as part of the data to encode empty strings). IMO, neither of these options is well suited to LazyBinaryColumnarSerDe: the former because it brings escaping complexities, and the latter because the storage is now by columns, not by rows. I have taken the approach that a 0-length column cell value indicates null (nulls, being a very common case, should have minimal overhead). For empty strings, while encoding the string length as part of the cell value is still an option, I think that is too much overhead for the non-empty cells (as shown in my tests on the same specific dataset).
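
      The overhead argument can be illustrated with a small back-of-the-envelope calculation. This is a sketch under stated assumptions, not a measurement from the patch: it counts only the value bytes of a string column (the lengths kept in the key part are the same in both schemes), and it assumes the length prefix costs one byte per non-null string, which would only hold for short strings.

```java
import java.nio.charset.StandardCharsets;

final class CellSizeSketch {
    // Assumption: a length prefix costs one byte for the short strings used here.
    static final int ASSUMED_PREFIX_BYTES = 1;

    /** Value bytes when every non-null string carries a length prefix. */
    static int prefixedSize(String[] column) {
        int total = 0;
        for (String s : column) {
            if (s != null) {
                total += ASSUMED_PREFIX_BYTES + s.getBytes(StandardCharsets.UTF_8).length;
            }
        }
        return total;
    }

    /** Value bytes when only empty strings pay one marker byte. */
    static int markerSize(String[] column) {
        int total = 0;
        for (String s : column) {
            if (s == null) {
                continue; // null is a 0-length cell in both schemes
            }
            total += s.isEmpty() ? 1 : s.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }
}
```

      Under these assumptions, the prefix scheme pays one extra byte for every non-empty string, while the marker scheme pays one byte only for empty strings; the two schemes cost the same for nulls and for empty strings.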

      The implementation is fine, I think. It first checks whether the field is a primitive (for non-primitives, the input byte stream length is also the data length), and then whether the field is a string of length 1 whose value is the special marker, etc.

      Will do the mapEqualComparer splitting.

      Franklin Hu added a comment -

      I've been doing some testing with this patch and it seems that tables using LazyBinaryColumnarSerDe are not outputting the rawDataSize stat that HIVE-2185 added (setting that field to 0).

      https://issues.apache.org/jira/browse/HIVE-2185

      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/806/
      -----------------------------------------------------------

      (Updated 2011-06-20 12:56:38.943799)

      Review request for hive and Yongqiang He.

      Changes
      -------

      After separating out mapcomparer changes to its own patch

      Summary
      -------

      Add LazyBinaryColumnarSerDe

      This addresses bug HIVE-956.
      https://issues.apache.org/jira/browse/HIVE-956

      Diffs (updated)


      ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java e79021d
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java e927547
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 2e2896c
      serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java PRE-CREATION
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java 1440472
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java ea20b34
      serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 4285ab3
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ColumnarStructObjectInspector.java 66f4f8d
      serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java 90561a1
      serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java PRE-CREATION

      Diff: https://reviews.apache.org/r/806/diff

      Testing
      -------

      Tests added

      Thanks,

      Krishna

      Krishna Kumar added a comment -

      After separating mapcomparer changes to its own jira/patch - HIVE-2209

      Krishna Kumar added a comment -

      I have tested the setting of rawDataSize with an "ANALYZE TABLE COMPUTE STATISTICS"/"DESCRIBE FORMATTED" set of commands with this patch, and it works. Can you please check it again? Please note that you need to apply the patch from HIVE-2209 before applying the patch from this one, since this jira depends on that one.

      Franklin Hu added a comment -

      @Krishna looks like it works now with some simple tests

      He Yongqiang added a comment -

      The patch looks good. One minor comment about LazyBinaryColumnarSerDe: could some code be moved to a utility class or a base class to share with ColumnarSerDe and/or LazyBinarySerDe?

      Can you confirm that it is safe to remove 'nullSequence' from ColumnarStructObjectInspector and ObjectInspectorFactory?

      Krishna Kumar added a comment -

      Please note that HIVE-2209 is a pre-requisite for this patch.

      Krishna Kumar added a comment -

      Common code moved to a base class so that ColumnarSerDe and LazyBinaryColumnarSerDe share it.

      Yes. The object inspector itself had no use for the null sequence. The ColumnarStruct itself knows the nullSequence and uses it as required.

      Have not run a complete test on this patch.

      He Yongqiang added a comment -

      +1, looks good, will commit after tests pass.

      He Yongqiang added a comment -

      committed, thanks Krishna Kumar!

      Hudson added a comment -

      Integrated in Hive-trunk-h0.21 #849 (See https://builds.apache.org/job/Hive-trunk-h0.21/849/)
      HIVE-956: add support of columnar binary serde (Krishna Kumar via He Yongqiang)

      heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1150978
      Files :

      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ColumnarStructObjectInspector.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDeBase.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java
      • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/columnar
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java
      • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java
      • /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java
      • /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java

        People

        • Assignee:
          Krishna Kumar
          Reporter:
          He Yongqiang
        • Votes:
          0
          Watchers:
          5


            Development