  1. Hive
  2. HIVE-20699 Query based compactor for full CRUD Acid tables
  3. HIVE-21266

Don't run cleaner if compaction is skipped (issue with single delta file)


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Transactions
    • Labels: None

    Description

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java#L353-L357

       

      if ((deltaCount + (dir.getBaseDirectory() == null ? 0 : 1)) + origCount <= 1) {
        LOG.debug("Not compacting {}; current base is {} and there are {} deltas and {} originals",
            sd.getLocation(), dir.getBaseDirectory(), deltaCount, origCount);
        return;
      }
       

      This check is problematic.
      Suppose there is a single delta file from streaming ingest, delta_11_20, in which txnid 13 was aborted. The check above returns without rewriting the delta, and the rewrite is the step that drops everything belonging to the aborted txn. The compaction nevertheless transitions to the "ready_for_cleaning" state, and markCleaned() then drops the metadata about the aborted txn. With that metadata gone, the aborted data comes back as committed.
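The idea in the issue title (do not run the cleaner when compaction was skipped) can be sketched as below. This is only an illustration, not the code from the attached patches: the class `CompactionSkipSketch`, the enums `Outcome` and `State`, and the methods `run` and `nextState` are all hypothetical names invented for this example. The only part taken from the description is the `<= 1` directory-count check.

```java
public class CompactionSkipSketch {

  /** Hypothetical outcome of one compaction attempt. */
  enum Outcome { COMPACTED, SKIPPED }

  /** Hypothetical stand-in for the states a compaction request can move to. */
  enum State { READY_FOR_CLEANING, SKIPPED_NO_CLEANING }

  /**
   * Mirrors the check quoted in the description: with at most one
   * base/delta/original there is nothing to merge, so the rewrite is skipped.
   * Unlike a bare "return;", the caller is told that nothing was rewritten.
   */
  static Outcome run(int deltaCount, boolean hasBase, int origCount) {
    if (deltaCount + (hasBase ? 1 : 0) + origCount <= 1) {
      return Outcome.SKIPPED;
    }
    // A real compactor would rewrite the files here and drop aborted rows.
    return Outcome.COMPACTED;
  }

  /**
   * Only a compaction that actually rewrote data is handed to the cleaner.
   * A skipped one must not reach the step that forgets aborted txns.
   */
  static State nextState(Outcome outcome) {
    return outcome == Outcome.COMPACTED
        ? State.READY_FOR_CLEANING
        : State.SKIPPED_NO_CLEANING;
  }

  public static void main(String[] args) {
    // The scenario from the description: one delta, no base, no originals.
    System.out.println(nextState(run(1, false, 0)));
    // Two deltas: the rewrite happens, so cleaning is safe afterwards.
    System.out.println(nextState(run(2, false, 0)));
  }
}
```

In this sketch the single-delta case never reaches the cleaning state, so the aborted-txn metadata stays in place and the aborted rows remain invisible to readers.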

      Attachments

        1. HIVE-21266.04.patch
          16 kB
          Karen Coppage
        2. HIVE-21266.04.patch
          16 kB
          Karen Coppage
        3. HIVE-21266.04.patch
          16 kB
          Karen Coppage
        4. HIVE-21266.03.patch
          15 kB
          Karen Coppage
        5. HIVE-21266.02.patch
          13 kB
          Karen Coppage
        6. HIVE-21266.02.patch
          13 kB
          Karen Coppage
        7. HIVE-21266.01.patch
          11 kB
          Karen Coppage


          People

            Assignee: Karen Coppage (klcopp)
            Reporter: Eugene Koifman (ekoifman)
            Votes: 0
            Watchers: 4
