Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Problem
The output data of each query is first stored in a temporary staging directory. It is moved to the final output directory only after all tasks have completed successfully. We call this step the output commit.
Currently, we simply move the output data set to the final directory. However, this approach makes failure handling very hard.
Solution
To solve this problem, we need a two-phase commit. The approach is as follows:
- When a task completes, it sends a commit-pending request to QueryMaster.
- QueryMaster chooses exactly one of the possibly multiple commit-pending requests and responds with a commit to the corresponding task.
- Only the task that receives the commit moves its result data to the final output directory. The others cancel their commit work.
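
The steps above can be sketched in a few lines. This is only an illustration, not Tajo code: `CommitArbiter`, `request_commit_pending`, and `finish_attempt` are hypothetical names, a dictionary stands in for the final output directory, and the sketch assumes the competing requests come from several attempts of the same task, which the description does not state explicitly.

```python
import threading


class CommitArbiter:
    """Hypothetical stand-in for the QueryMaster side of the protocol."""

    def __init__(self):
        self._lock = threading.Lock()
        self._winner = {}  # task id -> attempt id that was granted the commit

    def request_commit_pending(self, task_id, attempt_id):
        """Return True if this attempt may commit, False if it must cancel."""
        with self._lock:
            # setdefault records the first requester and ignores later ones.
            return self._winner.setdefault(task_id, attempt_id) == attempt_id


def finish_attempt(arbiter, task_id, attempt_id, staged, final_output):
    """Phase 1: ask for permission. Phase 2: commit or cancel."""
    if arbiter.request_commit_pending(task_id, attempt_id):
        # Stands in for moving staged files to the final output directory.
        final_output[task_id] = staged
        return "committed"
    return "cancelled"  # the staged data is discarded


arbiter = CommitArbiter()
final_output = {}
first = finish_attempt(arbiter, "task-0", "attempt-0", ["row-a"], final_output)
second = finish_attempt(arbiter, "task-0", "attempt-1", ["row-a"], final_output)
```

Because the choice is made in one place and under a lock, at most one attempt per task ever writes to the final output, so a late or duplicate attempt cannot overwrite or duplicate committed data.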
Issue Links
- is required by TAJO-1214 (Umbrella) Improve task and node failure handling (Open)