  1. Oozie
  2. OOZIE-1527

Fix scalability issues with coordinator materialization

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 4.1.0
    • Component/s: coordinator
    • Labels:
      None

      Description

      In certain situations, when the system holds a large number of coordinators, they have been observed to build up a huge materialization backlog and to progress much more slowly than expected. This patch can be viewed as either a bug fix or an enhancement, and it addresses the following points:

      1. 'materialization.system.limit' brings in Coord jobs in LRU fashion, but some of them may already be maxed out on actions to materialize (= throttle), so fewer than the limit number of jobs may actually undergo materialization. This patch performs a second iteration of loading jobs for materialization to reduce the backlog.

      2. A 'materialization.window' of 1 hour may work in most cases, but hourly jobs are seen to slow down significantly at times because many other minute-frequency jobs are being materialized. Therefore, the window can be doubled (i.e. to 2 hours) when the job is hourly or daily.

      3. For hourly coordinators, materialization is consistently seen to occur only near the end of the hour. E.g. for an action whose nominal time is 2:00, the action creation time is 1:59; if the nominal time is 3:00, the creation time is 2:58, and so on. If the window extends an hour into the future, that does not explain why materialization never occurs in the middle of the preceding hour.
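
      Points 1 and 2 above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attached patches: the class `MaterializationSketch`, the `CoordJob` and `Frequency` types, and both method names are hypothetical, and the reading of "second iteration" as a second pass over the remaining LRU candidates is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MaterializationSketch {

    /** Hypothetical frequency units; not Oozie's actual types. */
    enum Frequency { MINUTE, HOUR, DAY }

    /** Hypothetical, simplified view of a coordinator job. */
    static final class CoordJob {
        final String id;
        final Frequency frequency;
        final int waitingActions; // actions already materialized and still waiting
        final int throttle;       // maximum waiting actions allowed for this job

        CoordJob(String id, Frequency frequency, int waitingActions, int throttle) {
            this.id = id;
            this.frequency = frequency;
            this.waitingActions = waitingActions;
            this.throttle = throttle;
        }

        /** A throttled job cannot materialize any further actions right now. */
        boolean isThrottled() {
            return waitingActions >= throttle;
        }
    }

    /** Point 2: double the base window for hourly and daily jobs. */
    static long effectiveWindowSeconds(Frequency frequency, long baseWindowSeconds) {
        boolean coarse = frequency == Frequency.HOUR || frequency == Frequency.DAY;
        return coarse ? 2 * baseWindowSeconds : baseWindowSeconds;
    }

    /**
     * Point 1: pick up to {@code limit} jobs from an LRU-ordered candidate list.
     * The first pass looks only at the first {@code limit} candidates; if some
     * of those are throttled, a second pass fills the free slots from the rest.
     */
    static List<CoordJob> selectForMaterialization(List<CoordJob> lruOrdered, int limit) {
        List<CoordJob> selected = new ArrayList<>();
        int firstPassEnd = Math.min(limit, lruOrdered.size());

        // First pass: the jobs a plain "limit" query would have loaded.
        for (int i = 0; i < firstPassEnd; i++) {
            CoordJob job = lruOrdered.get(i);
            if (!job.isThrottled()) {
                selected.add(job);
            }
        }

        // Second pass: top up with later candidates until the limit is reached.
        for (int i = firstPassEnd; i < lruOrdered.size() && selected.size() < limit; i++) {
            CoordJob job = lruOrdered.get(i);
            if (!job.isThrottled()) {
                selected.add(job);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<CoordJob> lru = Arrays.asList(
                new CoordJob("a", Frequency.MINUTE, 5, 5), // throttled
                new CoordJob("b", Frequency.HOUR, 0, 5),
                new CoordJob("c", Frequency.MINUTE, 5, 5), // throttled
                new CoordJob("d", Frequency.DAY, 1, 5));

        for (CoordJob job : selectForMaterialization(lru, 2)) {
            System.out.println(job.id + " window=" + effectiveWindowSeconds(job.frequency, 3600L) + "s");
        }
    }
}
```

      With a limit of 2, a single pass over the first two candidates would materialize only one job because the other is throttled; the second pass fills the unused slot from the remaining candidates.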

        Attachments

        1. OOZIE-1527-V2.patch
          36 kB
          Purshotam Shah
        2. OOZIE-1527-V3.patch
          2 kB
          Purshotam Shah

              People

              • Assignee:
                puru Purshotam Shah
                Reporter:
                chitnis Mona Chitnis
              • Votes:
                0
                Watchers:
                6

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  96h
                  Remaining:
                  96h
                  Logged:
                  Not Specified