  1. Hadoop YARN
  2. YARN-2928 YARN Timeline Service v.2: alpha 1
  3. YARN-5095

flow activities and flow runs are populated with wrong timestamp when RM restarts w/ recovery enabled

Details

    • Reviewed

    Description

      I have RM recovery enabled. On restart, the RM writes records into the flow activity and flow run tables, but with wrong timestamps. By "timestamp" I mean the time component of the row key:

      • flow activity: row created with the day of the RM restart
      • flow run: row created with the RM start time as the "run id"

      The following illustrates an example flow run:

      {
        metrics: [ ],
        events: [ ],
        id: "sjlee@Sleep job/1463433569917",
        type: "YARN_FLOW_RUN",
        createdtime: 1463422860987,
        info: {
          UID: "yarn_cluster!sjlee!Sleep job!1463433569917",
          SYSTEM_INFO_FLOW_RUN_ID: 1463433569917,
          SYSTEM_INFO_FLOW_NAME: "Sleep job",
          SYSTEM_INFO_FLOW_RUN_END_TIME: 1463422865033,
          SYSTEM_INFO_USER: "sjlee"
        },
        isrelatedto: { },
        relatesto: { }
      }
      

      The created time and the end time are correct (i.e. they are the original times), whereas the timestamp in the row key (the run id, 1463433569917) is later than the end time and coincides with the RM restart.
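
      The inconsistency can be checked directly from the values in the entity above. The following is a minimal sketch, not part of the timeline service: the helper names (run_id_from_uid, to_utc, run_id_is_suspect) are hypothetical, and it assumes the run id is the last "!"-separated field of the UID, as it appears in this example.

```python
from datetime import datetime, timezone

# Values copied from the flow run entity shown in the report.
flow_run = {
    "uid": "yarn_cluster!sjlee!Sleep job!1463433569917",
    "created_time": 1463422860987,
    "end_time": 1463422865033,
}


def run_id_from_uid(uid: str) -> int:
    """Return the trailing '!'-separated field of the UID as an integer.

    Assumption: the run id is the last field, as in the example UID.
    """
    return int(uid.rsplit("!", 1)[1])


def to_utc(ts_ms: int) -> str:
    """Format an epoch-millisecond timestamp as an ISO 8601 UTC string."""
    dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return dt.isoformat(timespec="seconds")


def run_id_is_suspect(end_ms: int, run_id_ms: int) -> bool:
    """Hypothetical sanity check: a timestamp-based run id that is later
    than the run's recorded end time cannot reflect the original run."""
    return run_id_ms > end_ms


run_id = run_id_from_uid(flow_run["uid"])
suspect = run_id_is_suspect(flow_run["end_time"], run_id)
lag_ms = run_id - flow_run["end_time"]

print("created:", to_utc(flow_run["created_time"]))
print("ended:  ", to_utc(flow_run["end_time"]))
print("run id: ", to_utc(run_id))
print("run id later than end time:", suspect, f"(by {lag_ms} ms)")
```

      For the entity above, the run id is 10704884 ms (roughly three hours) later than the recorded end time, which matches the observation that it reflects the RM restart rather than the original run.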

      Attachments

        1. YARN-5095-YARN-2928.01.patch
          18 kB
          Varun Saxena
        2. YARN-5095-YARN-2928.02.patch
          10 kB
          Varun Saxena
        3. YARN-5095-YARN-2928.03.patch
          9 kB
          Varun Saxena

        Activity

          People

            varun_saxena Varun Saxena
            sjlee0 Sangjin Lee