Hadoop Map/Reduce / MAPREDUCE-4595

TestLostTracker failing - possibly due to a race in JobHistory.JobHistoryFilesManager#run()


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.3
    • Fix Version/s: 1.2.0
    • None
    • Reviewed

    Description

      The occasional failure of TestLostTracker appears to be caused by the following:

      On job completion, JobHistoryFilesManager#run() spawns another thread to move the history files to the done folder. TestLostTracker waits for job completion before checking the file format of the history file. However, at that point the move of the history files might still be in progress, or might not have started at all.

      The attachment (force-TestLostTracker-failure.patch) helps reproduce the error locally by increasing the chance of hitting this race.
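
      The race described above can be illustrated with a minimal, standalone sketch. This is not the code from either attached patch and does not use Hadoop's classes: the class HistoryMoveRaceSketch, the methods moveToDoneAsync and waitForFile, the file name, and the delays are all hypothetical, chosen only to show why checking the done folder right after "job completion" is unreliable, and how polling with a timeout is one possible way for a test to tolerate the asynchronous move.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class HistoryMoveRaceSketch {

    // Simulates the asynchronous move: the caller returns immediately,
    // while a separate thread moves the file after delayMs milliseconds.
    static Thread moveToDoneAsync(Path src, Path doneDir, long delayMs) {
        Thread mover = new Thread(() -> {
            try {
                Thread.sleep(delayMs);
                Files.move(src, doneDir.resolve(src.getFileName()),
                        StandardCopyOption.ATOMIC_MOVE);
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        mover.start();
        return mover;
    }

    // Polls until the file exists or the timeout elapses.
    // Returns true if the file was seen, false otherwise.
    static boolean waitForFile(Path file, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (Files.exists(file)) {
                return true;
            }
            Thread.sleep(10);
        }
        return Files.exists(file);
    }

    public static void main(String[] args) throws Exception {
        Path work = Files.createTempDirectory("history");
        Path doneDir = Files.createDirectory(work.resolve("done"));
        Path src = Files.write(work.resolve("job_0001_history"), new byte[] {1});
        Path doneFile = doneDir.resolve(src.getFileName());

        // "Job completion": the move has been scheduled but not necessarily run.
        moveToDoneAsync(src, doneDir, 200);

        // Racy check: the result depends on thread timing.
        System.out.println("exists immediately: " + Files.exists(doneFile));

        // Tolerant check: wait (bounded) for the move to finish before inspecting.
        System.out.println("exists after wait: " + waitForFile(doneFile, 5000));
    }
}
```

      A bounded poll keeps the test from hanging if the move never happens, at the cost of a small delay. Joining on the mover thread would be more direct, but it assumes the test can obtain a handle to that thread.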

      Attachments

        1. force-TestLostTracker-failure.patch
          0.9 kB
          Karthik Kambatla
        2. MR-4595.patch
          1 kB
          Karthik Kambatla
        3. MR-4595.patch
          1 kB
          Karthik Kambatla


          People

            Assignee: Karthik Kambatla (kasha)
            Reporter: Karthik Kambatla (kasha)

