Hive / HIVE-3302

Race condition in query plan for merging at the end of a query

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In the query plan that's used to merge files at the end of a query, the dependency tree looks something like:
                         MoveTask(1)
                        /           \
      ...ConditionalTask             MoveTask(2)...
                        \           /
                          MergeTask

      Here MoveTask(1) moves the partition data to a temporary location, and MoveTask(2) moves it to the final location.

      However, if dynamic partitions are generated and some of them are merged while others are only moved, the dependency tree is changed at runtime to:
      ...ConditionalTask             MoveTask(2)...
                        \           /
                          MergeTask
                                   \
                                    MoveTask(1)

      This produces a race condition between the two MoveTasks: if MoveTask(2) runs before MoveTask(1), the partitions handled by MoveTask(1) are moved to an intermediate location and never moved to the final location. In that case those partitions are silently lost.
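
      The ordering problem can be illustrated with a toy model. The sketch below is not Hive code: the dictionaries, function names, and edge sets are hypothetical, and the edges are an assumption read off the two diagrams above (each task lists the tasks that must finish before it). It also ignores how ConditionalTask selects its children at runtime. It only checks whether any dependency-respecting schedule lets MoveTask(2) run before MoveTask(1).

```python
from itertools import permutations

# Hypothetical toy model, not Hive's planner: task -> tasks that must finish first.
# Edges are assumed from the compile-time diagram in the description.
COMPILE_TIME_PLAN = {
    "ConditionalTask": set(),
    "MoveTask(1)": {"ConditionalTask"},
    "MergeTask": {"ConditionalTask"},
    "MoveTask(2)": {"MoveTask(1)", "MergeTask"},
}

# Edges assumed from the runtime-modified diagram: MoveTask(1) now hangs off
# MergeTask and nothing orders it relative to MoveTask(2).
RUNTIME_PLAN = {
    "ConditionalTask": set(),
    "MergeTask": {"ConditionalTask"},
    "MoveTask(2)": {"MergeTask"},
    "MoveTask(1)": {"MergeTask"},
}


def valid_orders(plan):
    """Return every execution order that respects the plan's dependencies."""
    orders = []
    for order in permutations(plan):
        position = {task: i for i, task in enumerate(order)}
        if all(
            position[dep] < position[task]
            for task, deps in plan.items()
            for dep in deps
        ):
            orders.append(order)
    return orders


def can_run_before(plan, first, second):
    """True if at least one valid order runs `first` before `second`."""
    return any(o.index(first) < o.index(second) for o in valid_orders(plan))


if __name__ == "__main__":
    # Compile-time plan: MoveTask(2) always waits for MoveTask(1).
    print(can_run_before(COMPILE_TIME_PLAN, "MoveTask(2)", "MoveTask(1)"))
    # Runtime plan: the two moves are unordered, so the bad schedule exists.
    print(can_run_before(RUNTIME_PLAN, "MoveTask(2)", "MoveTask(1)"))
```

      Under these assumed edges, the compile-time plan admits no schedule in which MoveTask(2) precedes MoveTask(1), while the runtime plan does, which is the window in which the partitions described above can be lost.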

      Attachments

        1. HIVE-3302.1.patch.txt
          457 kB
          Kevin Wilfong
        2. HIVE-3302.2.patch.txt
          535 kB
          Kevin Wilfong

          People

            Assignee: Kevin Wilfong (kevinwilfong)
            Reporter: Kevin Wilfong (kevinwilfong)
            Votes: 0
            Watchers: 5
