Hive / HIVE-19205

Hive streaming ingest improvements (v2)

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Streaming
    • Labels:


      This is an umbrella JIRA to track Hive streaming ingest improvements. At a high level, the improvements are:

      • Support for dynamic partitioning
      • API changes (simple streaming connection builder)
      • Hide transaction batches from clients (clients can tune the transaction batch size but do not have to know about it)
      • Support auto rollover to the next transaction batch (clients don't have to worry about closing a transaction batch and opening a new one)
      • All record writers will be strict, meaning the schema of the record has to match the table schema. This avoids the repeated serialization/deserialization needed to re-order columns when the schemas do not match
      • Automatic distribution for non-bucketed tables so that the compactor can have more parallelism
      • Create delta files with all ORC overhead disabled (no index, no compression, no dictionary). The compactor will recreate the ORC files with index, compression and dictionary encoding.
      • Automatic memory management via auto-flushing (this will yield smaller stripes for delta files, but it is more scalable and clients don't have to worry about distributing the data across writers)
      • Support for more writers (Avro specifically. ORC passthrough format?)
      • Support for accepting an input stream instead of a record byte[]
      • Removal of the HCatalog dependency (the old streaming API will stay in the hcatalog package for backward compatibility; the new streaming API will be in its own Hive module)
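
      To illustrate the "hide the transaction batches" and "auto rollover" items above, here is a minimal, self-contained sketch. It is not the Hive streaming API: RolloverSketch, SketchConnection and all of their methods are hypothetical names made up for this example, and records are kept in memory instead of being written to delta files. It only shows the intended client experience, where the caller begins, writes and commits transactions while the connection opens the next batch on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: none of these classes are part of Hive. */
class RolloverSketch {

    /** Hypothetical connection that hides transaction batches from the caller. */
    static final class SketchConnection {
        private final int batchSize;        // transactions per batch (tunable by the client)
        private int remainingInBatch = 0;   // transactions left in the current batch
        private int batchesOpened = 0;      // how many batches were opened internally
        private boolean inTransaction = false;
        private final List<String> pending = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();

        SketchConnection(int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.batchSize = batchSize;
        }

        void beginTransaction() {
            if (inTransaction) {
                throw new IllegalStateException("transaction already open");
            }
            if (remainingInBatch == 0) {
                // Auto rollover: the current batch is exhausted (or none exists yet),
                // so open a new one without involving the client.
                remainingInBatch = batchSize;
                batchesOpened++;
            }
            remainingInBatch--;
            inTransaction = true;
        }

        void write(String record) {
            if (!inTransaction) {
                throw new IllegalStateException("no open transaction");
            }
            pending.add(record);
        }

        void commitTransaction() {
            if (!inTransaction) {
                throw new IllegalStateException("no open transaction");
            }
            committed.addAll(pending);
            pending.clear();
            inTransaction = false;
        }

        void abortTransaction() {
            // Discard everything written since beginTransaction().
            pending.clear();
            inTransaction = false;
        }

        int batchesOpened() {
            return batchesOpened;
        }

        List<String> committed() {
            return new ArrayList<>(committed);  // defensive copy
        }
    }

    public static void main(String[] args) {
        SketchConnection conn = new SketchConnection(2);
        // The client never opens or closes a batch explicitly.
        for (int txn = 0; txn < 5; txn++) {
            conn.beginTransaction();
            conn.write("row-" + txn);
            conn.commitTransaction();
        }
        // prints "3 batches, 5 rows"
        System.out.println(conn.batchesOpened() + " batches, " + conn.committed().size() + " rows");
    }
}
```

      With a batch size of 2, five transactions cause three batches to be opened internally, yet the client code contains no batch handling at all. The batch size stays a tuning knob, not something the client must manage.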





            • Assignee:
              Prasanth Jayachandran (prasanth_j)

