  1. Hadoop Map/Reduce
  2. MAPREDUCE-7065

Improve information stored in ATSv2 for MR jobs


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      While exploring the possibility of retrieving every piece of information that JHS presents today through ATSv2, I found a few improvements we can make.

      1) JHS splits MR tasks by type, into map tasks and reduce tasks, but they are stored indistinguishably as entities of type MR_TASK. We can split MR_TASK into MR_MAP_TASK and MR_REDUCE_TASK, and likewise split MR_TASK_ATTEMPT.

      2) The final state of a task attempt is stored in its events, so we cannot use infofilter to group task attempts by final state, which is what JHS does.

      3) Display names of counters are not stored in JHS. We currently store (counter name, display name, value) as a metric (counter name, value), which drops the display name. We could store (counter name, display name) as an info. The same applies to the sources of job configuration properties.

      4) Job-level counters and configuration properties are stored in both ApplicationTable and EntityTable. It is probably safe to store MR-specific counters in EntityTable only.
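
      Items 1) and 2) can be sketched roughly as follows. This is only an illustration in plain Java with no Hadoop dependencies: the class MrEntityTypes, its methods, the attempt type names ending in _ATTEMPT, and the FINAL_STATE info key are all hypothetical and are not existing ATSv2 or MR identifiers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Hadoop.
class MrEntityTypes {
  enum TaskType { MAP, REDUCE }

  // Entity type for a task, replacing the single MR_TASK type.
  static String taskEntityType(TaskType type) {
    return type == TaskType.MAP ? "MR_MAP_TASK" : "MR_REDUCE_TASK";
  }

  // Assumed naming for attempts, by analogy with the task types.
  static String taskAttemptEntityType(TaskType type) {
    return taskEntityType(type) + "_ATTEMPT";
  }

  // Info map for a finished attempt. Keeping the final state as an info
  // (FINAL_STATE is an assumed key) makes it available to info filters.
  static Map<String, Object> attemptInfo(String finalState) {
    Map<String, Object> info = new HashMap<>();
    info.put("FINAL_STATE", finalState);
    return info;
  }
}
```

      With the final state kept as an info on the attempt entity, a reader query could filter on that key instead of scanning the events of every attempt.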

       

      One general problem I see around this area in MR:

      1) We can precompute the number of failed/killed/successful map/reduce task attempts and the average map/reduce/shuffle/merge time in the AM. This would avoid iterating over all task attempts when JHS serves the Job Overview page.
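
      A minimal sketch of such precomputation, assuming the AM is notified once when each attempt reaches a final state. The class AttemptStats is hypothetical, and averaging only over successful attempts is an assumption rather than something stated above. One instance would be kept per phase (map, reduce, shuffle, merge).

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical running aggregate; not part of Hadoop.
class AttemptStats {
  enum State { SUCCEEDED, FAILED, KILLED }

  private final Map<State, Integer> counts = new EnumMap<>(State.class);
  private long totalSuccessfulMillis = 0;

  // Called once when an attempt reaches a final state.
  void record(State state, long elapsedMillis) {
    counts.merge(state, 1, Integer::sum);
    if (state == State.SUCCEEDED) {
      totalSuccessfulMillis += elapsedMillis;
    }
  }

  int count(State state) {
    return counts.getOrDefault(state, 0);
  }

  // Average elapsed time of successful attempts, or 0 if there are none.
  long averageSuccessfulMillis() {
    int n = count(State.SUCCEEDED);
    return n == 0 ? 0 : totalSuccessfulMillis / n;
  }
}
```

      The counts and averages are then available in constant time when the job finishes and can be written once with the job entity.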

       

      To fully replace JHS with ATSv2, ATSv2 needs to support three features:

      1) An /apps/ query so that a list of all jobs can be retrieved (YARN-6058).

      2) A streaming API to get all generic entities (YARN-5627).

      3) A per-app data retention policy. This is likely a setting in TimelineWriter that allows admins to specify how long information about a given application should be kept, in the form of a TTL in HBase.
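
      For item 3), one possible shape of the policy lookup is sketched below. The class RetentionPolicy and the application id used in the usage note are hypothetical. HBase mutations accept a per-cell TTL in milliseconds through Mutation.setTTL, so the value returned here could be attached to each write; whether TimelineWriter would do it that way is an assumption.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-application retention lookup; not part of Hadoop.
class RetentionPolicy {
  private final Duration defaultRetention;
  private final Map<String, Duration> perApp = new HashMap<>();

  RetentionPolicy(Duration defaultRetention) {
    this.defaultRetention = defaultRetention;
  }

  // Admin override for a single application.
  void setRetention(String appId, Duration retention) {
    perApp.put(appId, retention);
  }

  // TTL in milliseconds for data written on behalf of this application,
  // falling back to the default when no override is set.
  long ttlMillis(String appId) {
    return perApp.getOrDefault(appId, defaultRetention).toMillis();
  }
}
```

      For example, with a default of Duration.ofDays(7) and an override of Duration.ofDays(1) for "application_1_0001", ttlMillis returns one day in milliseconds for that application and seven days for any other.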

          People

            haibochen Haibo Chen
