  1. Hive
  2. HIVE-20377 Hive Kafka Storage Handler
  3. HIVE-20481

Add the Kafka Key record as part of the row.

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: None
    • Labels: None

    Description

      Kafka records are keyed. In most cases the key is either null or is used to route records to the same partition. This patch exposes the record key as a binary column named __key.
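      The role of the key can be sketched as follows. This is a minimal illustration, not code from the patch or from Kafka: the function names choose_partition and decode_key are hypothetical, CRC32 is only a stand-in hash (Kafka's default Java partitioner hashes the key bytes with murmur2), and the sketch assumes keys are UTF-8 text, which the binary __key column does not guarantee.

```python
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int, fallback: int = 0) -> int:
    """Pick a partition for a record (illustrative stand-in, not Kafka's partitioner)."""
    if key is None:
        # Unkeyed records carry no routing constraint; here the caller
        # supplies a fallback slot instead of a real balancing strategy.
        return fallback % num_partitions
    # Hashing the key bytes means equal keys always land on the same partition.
    return zlib.crc32(key) % num_partitions


def decode_key(key: Optional[bytes]) -> Optional[str]:
    """Render a binary key as text, assuming it holds UTF-8 (an assumption)."""
    return None if key is None else key.decode("utf-8")


# Hypothetical records: two share a key, one has no key at all.
records = [(b"user-42", "edit A"), (None, "edit B"), (b"user-42", "edit C")]
for key, value in records:
    print(decode_key(key), choose_partition(key, 4), value)
```

      Both records keyed with user-42 resolve to the same partition, while the unkeyed record surfaces as a null key, which is what a reader of the __key column would see for such rows.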

      The new table layout is as follows:

      POSTHOOK: type: CREATETABLE
      POSTHOOK: Output: database:default
      POSTHOOK: Output: default@wiki_kafka_avro_table
      PREHOOK: query: describe extended wiki_kafka_avro_table
      PREHOOK: type: DESCTABLE
      PREHOOK: Input: default@wiki_kafka_avro_table
      POSTHOOK: query: describe extended wiki_kafka_avro_table
      POSTHOOK: type: DESCTABLE
      POSTHOOK: Input: default@wiki_kafka_avro_table
      isrobot             	boolean             	from deserializer   
      channel             	string              	from deserializer   
      timestamp           	string              	from deserializer   
      flags               	string              	from deserializer   
      isunpatrolled       	boolean             	from deserializer   
      page                	string              	from deserializer   
      diffurl             	string              	from deserializer   
      added               	bigint              	from deserializer   
      comment             	string              	from deserializer   
      commentlength       	bigint              	from deserializer   
      isnew               	boolean             	from deserializer   
      isminor             	boolean             	from deserializer   
      delta               	bigint              	from deserializer   
      isanonymous         	boolean             	from deserializer   
      user                	string              	from deserializer   
      deltabucket         	double              	from deserializer   
      deleted             	bigint              	from deserializer   
      namespace           	string              	from deserializer   
      __key               	binary              	from deserializer   
      __partition         	int                 	from deserializer   
      __offset            	bigint              	from deserializer   
      __timestamp         	bigint              	from deserializer   
      __start_offset      	bigint              	from deserializer   
      __end_offset        	bigint              	from deserializer  
      

      Attachments

        1. HIVE-20481.patch
          69 kB
          Slim Bouguerra

          People

            Assignee: Slim Bouguerra (bslim)
            Reporter: Slim Bouguerra (bslim)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:
