HIVE-13403

Make Streaming API not create empty buckets

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 2.3.0
    • Component/s: HCatalog, Transactions
    • Labels: None

    Description

      As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a full complement of bucket files is created on disk (in AbstractRecordWriter.createRecordUpdaters()), even though some may end up receiving no data.

      It would be better to create them on demand and not clog the FS.

      Tez can handle missing (empty) buckets. On MR, the bucket join algorithms check whether all buckets are present and bail out if not.
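
      The on-demand approach suggested above could look roughly like the following. This is a minimal sketch, not the code from the attached patches: the class LazyBucketWriters, its methods, and the factory callback are all hypothetical names chosen for illustration, and the writer type is left generic rather than tied to any Hive class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/**
 * Hypothetical sketch of on-demand bucket writer creation.
 * None of these names are taken from the Hive code base.
 */
class LazyBucketWriters<W> {
    // Slot i stays null until the first record for bucket i arrives.
    private final List<W> writers;
    // Assumed callback that creates the writer (and its file) for one bucket.
    private final IntFunction<W> factory;

    LazyBucketWriters(int totalBuckets, IntFunction<W> factory) {
        this.writers = new ArrayList<>(totalBuckets);
        for (int i = 0; i < totalBuckets; i++) {
            writers.add(null);
        }
        this.factory = factory;
    }

    /** Returns the writer for a bucket, creating it on first use. */
    W writerFor(int bucket) {
        W writer = writers.get(bucket);
        if (writer == null) {
            writer = factory.apply(bucket);
            writers.set(bucket, writer);
        }
        return writer;
    }

    /** Number of buckets that actually received a writer. */
    int createdCount() {
        int count = 0;
        for (W writer : writers) {
            if (writer != null) {
                count++;
            }
        }
        return count;
    }
}
```

      With this shape, opening a batch allocates only the list of empty slots; a bucket file would appear on disk only when writerFor is first called for that bucket, so buckets that never receive data leave nothing behind.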

      Attachments

        1. HIVE-13403.5.patch
          11 kB
          Wei Zheng
        2. HIVE-13403.4.patch
          10 kB
          Wei Zheng
        3. HIVE-13403.3.patch
          10 kB
          Wei Zheng
        4. HIVE-13403.2.patch
          4 kB
          Wei Zheng
        5. HIVE-13403.1.patch
          4 kB
          Wei Zheng

            People

              Assignee: Wei Zheng (wzheng)
              Reporter: Eugene Koifman (ekoifman)
              Votes: 0
              Watchers: 4
