Hive / HIVE-25600

Compaction job creates redundant base/delta folder within base/delta folder


Details

    Description

      Hive table 'aa.bb_item' is corrupt. Found sub-directory 'abfs://aabb-hive-data@gen2hiveaabb.dfs.core.windows.net/prod-data/aa.db/bb_item/part_created_on=202105/base_0004042/base_0004042' in bucket directory for partition: part_created_on=202105
       at io.prestosql.plugin.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:543)
       at io.prestosql.plugin.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:325)
       at io.prestosql.plugin.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:254)
       at io.prestosql.plugin.hive.util.ResumableTasks$1.run(ResumableTasks.java:38)
       at io.prestosql.$gen.Presto_347____20210615_143054_2.run(Unknown Source)
       at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:80)
       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       at java.base/java.lang.Thread.run(Thread.java:829);

      Why it happens:
      If the Hive Metastore (HMS) is restarted while a compaction MR job is still in progress, multiple compaction jobs can be triggered for the same transactions.
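
      To make the symptom concrete, here is a minimal sketch of how the nested layout reported in the error above could be detected from a listing of paths. It is not part of Hive or Presto. The function name `find_nested_acid_dirs` and the directory-name pattern are assumptions made for illustration, and the input is assumed to be a plain list of path strings obtained by some other means (for example, a recursive listing of the partition directory).

```python
import re

# Assumption: ACID directory names start with base_<digits>, delta_<digits>
# or delete_delta_<digits>. This pattern is illustrative, not Hive's own parser.
ACID_DIR = re.compile(r"^(base|delta|delete_delta)_\d+")


def find_nested_acid_dirs(paths):
    """Return each path prefix where an ACID directory sits directly
    inside another ACID directory (e.g. .../base_0004042/base_0004042)."""
    hits = []
    for path in paths:
        parts = path.rstrip("/").split("/")
        for i in range(1, len(parts)):
            # Flag a base/delta component whose parent is also base/delta.
            if ACID_DIR.match(parts[i - 1]) and ACID_DIR.match(parts[i]):
                prefix = "/".join(parts[: i + 1])
                if prefix not in hits:
                    hits.append(prefix)
    return hits


if __name__ == "__main__":
    # Hypothetical listing; only the first entry has the redundant nesting.
    listing = [
        "/prod-data/aa.db/bb_item/part_created_on=202105/base_0004042/base_0004042",
        "/prod-data/aa.db/bb_item/part_created_on=202105/base_0004042/bucket_00000",
    ]
    for hit in find_nested_acid_dirs(listing):
        print("nested ACID directory:", hit)
```

      A check like this only reports the corrupt layout; it does not address the duplicate compaction jobs that cause it.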


          People

            Assignee: Nikhil Gupta (gupta.nikhil0007)
            Reporter: Nikhil Gupta (gupta.nikhil0007)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 1h
