Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Kafka records are keyed. In most cases the key is either null or used to route records with the same key to the same partition. This patch exposes the key as a binary column named __key.
The new table layout is as follows:
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@wiki_kafka_avro_table
PREHOOK: query: describe extended wiki_kafka_avro_table
PREHOOK: type: DESCTABLE
PREHOOK: Input: default@wiki_kafka_avro_table
POSTHOOK: query: describe extended wiki_kafka_avro_table
POSTHOOK: type: DESCTABLE
POSTHOOK: Input: default@wiki_kafka_avro_table
isrobot             boolean             from deserializer
channel             string              from deserializer
timestamp           string              from deserializer
flags               string              from deserializer
isunpatrolled       boolean             from deserializer
page                string              from deserializer
diffurl             string              from deserializer
added               bigint              from deserializer
comment             string              from deserializer
commentlength       bigint              from deserializer
isnew               boolean             from deserializer
isminor             boolean             from deserializer
delta               bigint              from deserializer
isanonymous         boolean             from deserializer
user                string              from deserializer
deltabucket         double              from deserializer
deleted             bigint              from deserializer
namespace           string              from deserializer
__key               binary              from deserializer
__partition         int                 from deserializer
__offset            bigint              from deserializer
__timestamp         bigint              from deserializer
__start_offset      bigint              from deserializer
__end_offset        bigint              from deserializer
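
Because __key is exposed as raw binary, a reader of the column has to decide how to interpret the bytes. The following is a minimal sketch of one way to do that, not code from the patch. The class KafkaKeyColumnSketch and the helper describeKey are hypothetical names, and the sketch assumes that producers wrote UTF-8 string keys and that records without a key show up as a null value.

```java
import java.nio.charset.StandardCharsets;

public class KafkaKeyColumnSketch {

    // Hypothetical helper (not part of Hive or Kafka): renders the raw bytes
    // of a record key for display. Assumes the producer wrote UTF-8 string
    // keys; other key serializers would need a different decoder.
    static String describeKey(byte[] key) {
        if (key == null) {
            // Assumption: a record produced without a key yields a null value.
            return "NULL";
        }
        return new String(key, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] keyed = "page-42".getBytes(StandardCharsets.UTF_8);
        System.out.println(describeKey(keyed)); // prints "page-42"
        System.out.println(describeKey(null));  // prints "NULL"
    }
}
```

Keeping the column binary leaves the choice of decoder to the query author, which matters when keys are serialized as something other than text.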