  1. Tajo (Retired)
  2. TAJO-1216

Output commit should be two phase commit

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Problem

      The output data of each query is first written to a temporary staging directory. Once all tasks have completed successfully, it is moved to the final output directory. We call this step the output commit.

      Currently, the output commit simply moves the output data set to the final directory. This approach makes failure handling very hard.
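As a rough illustration of the single-step commit described above, here is a minimal sketch using `java.nio.file` on a local file system. The class and method names (`SimpleOutputCommit`, `commit`) are hypothetical and are not taken from the Tajo code base, which works against its own storage layer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a single-step output commit; not Tajo's actual code. */
public class SimpleOutputCommit {

    /**
     * Moves the staging directory to the final output location in one step.
     * Assumes both paths are on the same file system and that finalDir
     * does not exist yet.
     */
    public static void commit(Path stagingDir, Path finalDir) throws IOException {
        // Nothing records which task or attempt performed the move, so a
        // failure around this call leaves no state to recover from.
        Files.move(stagingDir, finalDir);
    }
}
```

Because the move is the only step, there is no point at which a coordinator can decide whether a given output should be committed or discarded.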

      Solution

      To solve this problem, we need a two-phase commit. The approach is as follows:

      • When a task completes, it sends a commit-pending request to the QueryMaster.
      • The QueryMaster may receive multiple commit-pending requests; it chooses exactly one of them and responds with a commit to the corresponding task.
      • Only the task that receives the commit moves its result data to the final output directory. The other tasks cancel their commit work.
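The steps above could be sketched on the QueryMaster side roughly as follows. This is only an illustration under stated assumptions: it assumes the competing commit-pending requests come from multiple attempts of the same task, and the names `CommitArbiter`, `requestCommit`, `taskId`, and `attemptId` are hypothetical, not part of Tajo's API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical QueryMaster-side arbiter for commit-pending requests. */
public class CommitArbiter {

    // Maps a task id to the single attempt that was granted the commit.
    private final ConcurrentMap<String, String> granted = new ConcurrentHashMap<>();

    /**
     * Handles a commit-pending request from a task attempt.
     * Returns true if this attempt may move its output to the final
     * directory, false if it must cancel its commit work.
     */
    public boolean requestCommit(String taskId, String attemptId) {
        // putIfAbsent is atomic, so only the first attempt per task is recorded.
        String winner = granted.putIfAbsent(taskId, attemptId);
        // A repeated request from the winning attempt is granted again,
        // which keeps the call idempotent if a response is lost and retried.
        return winner == null || winner.equals(attemptId);
    }
}
```

An attempt that receives `true` would then move its staged output to the final directory, while an attempt that receives `false` would discard its staged output. Recording the winner also gives failure handling a place to look up which attempt owns the committed output.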

            People

              Assignee: Unassigned
              Reporter: Hyunsik Choi (hyunsik)