  1. Hive
  2. HIVE-22413

Avoid dirty read when reading the ACID table while compaction is running


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels:
      None

      Description

      A dirty read can occur when an ACID table is read while the compactor is still creating base or delta directories. This is especially likely on S3 storage, because a “move” on S3 is implemented as “copy and delete”, and the copy takes a long time when the files are large or the bucket count is high.

      The proposed logic to avoid this problem is as follows. While listing the child directories of the partition directory for an ACID read, if a “_tmp”-prefixed directory exists, compare each child directory's name with the names of the directories inside the “_tmp” one, and skip any child whose name matches. The reader then uses the files from before the merge, so the results do not change.
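      The skip logic described above can be sketched roughly as follows. This is only an illustration and not the code in the attached patch: it uses java.nio.file on a local file system instead of the Hadoop FileSystem API, and the class name TmpAwareListing, the method listReadableDirs, and the constant TMP_PREFIX are hypothetical names introduced here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TmpAwareListing {
    // Assumption: the compactor stages its output in a child directory
    // whose name starts with "_tmp", as described in this issue.
    static final String TMP_PREFIX = "_tmp";

    // Returns the names of the child directories a reader may use,
    // skipping any directory that is still staged under a "_tmp" directory.
    static List<String> listReadableDirs(Path partitionDir) throws IOException {
        Set<String> inFlight = new HashSet<>();
        List<String> candidates = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(partitionDir)) {
            for (Path child : children) {
                if (!Files.isDirectory(child)) {
                    continue;
                }
                String name = child.getFileName().toString();
                if (name.startsWith(TMP_PREFIX)) {
                    // Collect the names of directories being produced by the compactor.
                    try (DirectoryStream<Path> staged = Files.newDirectoryStream(child)) {
                        for (Path s : staged) {
                            inFlight.add(s.getFileName().toString());
                        }
                    }
                } else {
                    candidates.add(name);
                }
            }
        }
        // A directory with the same name as a staged one may be only partially
        // copied, so the reader ignores it and falls back to the older files.
        candidates.removeIf(inFlight::contains);
        Collections.sort(candidates);
        return candidates;
    }
}
```

      For example, if the partition holds delta_0000001_0000001, delta_0000002_0000002, a partially copied base_0000002, and _tmp_x/base_0000002 (illustrative names), the sketch returns only the two delta directories, so the reader sees the same data it would have seen before compaction.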

        Attachments

        1. HIVE-22413.1.patch
          3 kB
          Hocheol Park
