[GIRAPH-950] Auto-restart from checkpoint doesn't pick up latest checkpoint - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.1.0
Fix Version/s: 1.1.0
Component/s: None
Labels: None

Description

While running different jobs with checkpoints enabled, I noticed some issues:
1) The way we pick up the latest checkpoint is not correct. The current implementation just picks whatever is returned last from FileSystem.list(), which is not necessarily the latest checkpoint.
2) If a job restarts from a checkpoint, it immediately creates another checkpoint.
3) We need more flexibility in GiraphJobRetryChecker to allow restarts after multiple failures.
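
To illustrate issue 1, here is a minimal sketch of choosing the latest checkpoint by parsing the superstep number instead of relying on listing order. It is not the committed fix. The class name, the method name, and the file naming scheme (`<superstep>.finalized`) are assumptions made for this example, and plain file-name strings stand in for the results of a file system listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

public class LatestCheckpoint {
    // Assumed suffix marking a completed checkpoint (hypothetical for this sketch).
    static final String FINALIZED_SUFFIX = ".finalized";

    /**
     * Returns the highest superstep number among finalized checkpoint files,
     * or an empty OptionalLong if no such file is present.
     * The order of the input list does not matter.
     */
    public static OptionalLong latestSuperstep(List<String> fileNames) {
        return fileNames.stream()
            // Only consider files that mark a completed checkpoint.
            .filter(name -> name.endsWith(FINALIZED_SUFFIX))
            // Strip the suffix to get the superstep part of the name.
            .map(name -> name.substring(0, name.length() - FINALIZED_SUFFIX.length()))
            // Skip names whose remaining part is not a plain number.
            .filter(stem -> stem.matches("\\d{1,18}"))
            .mapToLong(Long::parseLong)
            // Compare numerically, so 10 ranks above 9 and 2.
            .max();
    }

    public static void main(String[] args) {
        // A listing may return entries in any order, e.g. lexicographic.
        List<String> listing =
            Arrays.asList("10.finalized", "2.finalized", "9.finalized", "11.metadata");
        System.out.println(latestSuperstep(listing));
    }
}
```

With a listing like the one in `main`, taking the last element would select superstep 9, while the numeric maximum selects superstep 10.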

People

Assignee: Unassigned

Reporter: Sergey Edunov

Votes: 0

Watchers: 1

Dates

Created: 24/Sep/14 20:39

Updated: 20/Oct/14 17:16

Resolved: 20/Oct/14 17:16