  1. Hadoop Map/Reduce
  2. MAPREDUCE-6480

archive-logs tool may miss applications

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      MAPREDUCE-6415 added a tool to archive aggregated logs into HAR files. It seeds the initial list of applications to process from the apps that have finished aggregating, according to the RM. However, the RM doesn't remember completed applications forever (e.g. after a failover), so the tool can miss applications that are no longer in the RM.

      Instead, we should do the following:

      1. Seed the initial list of apps based on the aggregated log directories
      2. Make the RM not consider applications "complete" until their log aggregation has reached a terminal state (i.e. DISABLED, SUCCEEDED, FAILED, TIME_OUT).

      #2 will allow #1 to assume that any apps not found in the RM are done aggregating. #1 on its own should cover most cases, though.
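
      The selection logic proposed above could be sketched roughly as follows. This is only an illustration, not the code in the attached patches: the class `ArchiveLogsSketch`, the method `eligibleApps`, and the local `AggregationStatus` enum are hypothetical stand-ins, the list of directory names stands in for a listing of the aggregated log directories, and the map stands in for whatever the RM currently reports.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ArchiveLogsSketch {
    // Hypothetical stand-in for the RM's log aggregation status.
    // The four terminal names come from the description; the others are assumed.
    enum AggregationStatus { NOT_START, RUNNING, DISABLED, SUCCEEDED, FAILED, TIME_OUT }

    // Terminal states as listed in step 2 of the description.
    static final Set<AggregationStatus> TERMINAL = EnumSet.of(
            AggregationStatus.DISABLED, AggregationStatus.SUCCEEDED,
            AggregationStatus.FAILED, AggregationStatus.TIME_OUT);

    /**
     * Step 1: seed candidates from the aggregated log directories.
     * Then keep an app if the RM no longer knows it (assumed done, per step 2)
     * or if the RM reports a terminal aggregation status for it.
     */
    static Set<String> eligibleApps(List<String> aggregatedLogDirs,
                                    Map<String, AggregationStatus> rmApps) {
        Set<String> eligible = new TreeSet<>();
        for (String appId : aggregatedLogDirs) {
            AggregationStatus status = rmApps.get(appId);
            if (status == null || TERMINAL.contains(status)) {
                eligible.add(appId);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // app_3 has a log directory but is unknown to the RM (e.g. after a failover).
        List<String> dirs = List.of("app_1", "app_2", "app_3");
        Map<String, AggregationStatus> rm = Map.of(
                "app_1", AggregationStatus.SUCCEEDED,
                "app_2", AggregationStatus.RUNNING);
        System.out.println(eligibleApps(dirs, rm));
    }
}
```

      In this sketch, app_2 is skipped because its aggregation is still running, while app_3 is picked up even though the RM has forgotten it. Treating an app that is missing from the RM as finished is only safe once step 2 is in place.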

        Attachments

        1. MAPREDUCE-6480.001.patch
          26 kB
          Robert Kanter
        2. MAPREDUCE-6480.002.patch
          30 kB
          Robert Kanter
        3. MAPREDUCE-6480.003.patch
          37 kB
          Robert Kanter

            People

            • Assignee:
              rkanter Robert Kanter
              Reporter:
              rkanter Robert Kanter
