  1. Hadoop Map/Reduce
  2. MAPREDUCE-4595

TestLostTracker failing - possibly due to a race in JobHistory.JobHistoryFilesManager#run()

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.3
    • Fix Version/s: 1.2.0
    • None
    • Hadoop Flags: Reviewed

    Description

      The occasional failures of TestLostTracker appear to be caused by the following race:

      On job completion, JobHistoryFilesManager#run() spawns another thread to move the history files to the done folder. TestLostTracker waits only for job completion before checking the format of the history file. At that point, however, the move might still be in progress, or might not have started at all.

      The attachment (force-TestLostTracker-failure.patch) helps reproduce the error locally by increasing the chance of hitting this race.
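
      The race described above can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: the class HistoryMoveRaceSketch, its methods, and the file name job_0001.hist are all hypothetical, and the 200 ms delay merely stands in for whatever widens the window in practice. It moves a file on a background thread, checks for it immediately (the racy check), and then polls with a timeout. Polling is one common way to make such a test robust; it is an assumption here and not necessarily what MR-4595.patch does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical, self-contained illustration of the race; not Hadoop code. */
public class HistoryMoveRaceSketch {

    /** Simulates the asynchronous move of a history file to the done folder. */
    static void moveAsync(ExecutorService mover, Path src, Path dst, long delayMs) {
        mover.submit(() -> {
            try {
                Thread.sleep(delayMs);   // artificially widen the race window
                Files.move(src, dst);
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
    }

    /** Polls until the file exists or the timeout elapses. */
    static boolean waitForFile(Path file, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (Files.exists(file)) {
                return true;
            }
            Thread.sleep(10);
        }
        return Files.exists(file);
    }

    /** Runs the scenario; returns {seenImmediately, seenAfterWaiting}. */
    static boolean[] runScenario() throws Exception {
        Path dir = Files.createTempDirectory("history");
        Path done = Files.createDirectory(dir.resolve("done"));
        Path src = Files.createFile(dir.resolve("job_0001.hist"));
        Path dst = done.resolve(src.getFileName());
        ExecutorService mover = Executors.newSingleThreadExecutor();
        try {
            moveAsync(mover, src, dst, 200);
            // Racy check: the "job" is complete, but the move may not have run yet.
            boolean immediate = Files.exists(dst);
            // Robust check: wait (bounded) for the move to finish.
            boolean waited = waitForFile(dst, 5000);
            return new boolean[] { immediate, waited };
        } finally {
            mover.shutdown();
            mover.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = runScenario();
        System.out.println("immediate check saw file: " + r[0]);
        System.out.println("check after waiting saw file: " + r[1]);
    }
}
```

      With the delay in place, the immediate check will typically miss the file while the bounded wait finds it, which mirrors a test that inspects the history file as soon as the job completes.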

      Attachments

        1. force-TestLostTracker-failure.patch
          0.9 kB
          Karthik Kambatla
        2. MR-4595.patch
          1 kB
          Karthik Kambatla
        3. MR-4595.patch
          1 kB
          Karthik Kambatla

        Activity

          People

            Assignee: Karthik Kambatla (kasha)
            Reporter: Karthik Kambatla (kasha)
            Votes: 0
            Watchers: 5