  1. Oozie
  2. OOZIE-1527

Fix scalability issues with coordinator materialization

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 4.1.0
    • Component/s: coordinator
    • Labels: None

    Description

      When the system holds a large number of coordinators, they have been observed to build up a huge materialization backlog and to progress much more slowly than expected. This patch can be viewed as both a bug fix and an enhancement, addressing the following points:

      1. 'materialization.system.limit' causes Coord jobs to be loaded in LRU fashion, but some of them may already be at their maximum number of actions to materialize (= throttle), so fewer jobs than the limit may actually undergo materialization. This patch does a second iteration of loading jobs for materialization to reduce the backlog.

      2. A 'materialization.window' of 1 hour may work in most cases, but hourly jobs are at times seen to slow down significantly because many other minute-frequency jobs are being materialized. Therefore, the window can be doubled (i.e. 2 hours) when the job is hourly/daily.

      3. For hourly coordinators, materialization is consistently seen to occur only near the end of the hour. E.g. for an action whose nominal time is 2:00, the action creation time is 1:59; for nominal time 3:00, the creation time is 2:58, and so on. A window that reaches an hour into the future doesn't explain why materialization never occurs in the middle of the preceding hour.
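
Points 1 and 2 can be sketched roughly as follows. This is only an illustration of one possible reading of the description, not the code in the attached patches: the class `MaterializationSketch`, the `CoordJob` record and both method names are hypothetical, and treating any frequency of 60 minutes or more as "hourly/daily" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every name here is hypothetical, not Oozie's API.
final class MaterializationSketch {

    /** Hypothetical stand-in for a coordinator job record. */
    static final class CoordJob {
        final String id;
        final int frequencyMinutes; // 5 = every five minutes, 60 = hourly, 1440 = daily
        final int waitingActions;   // actions already materialized and not yet run
        final int throttle;         // cap on waiting actions for this job

        CoordJob(String id, int frequencyMinutes, int waitingActions, int throttle) {
            this.id = id;
            this.frequencyMinutes = frequencyMinutes;
            this.waitingActions = waitingActions;
            this.throttle = throttle;
        }

        /** True when the job cannot materialize any more actions right now. */
        boolean atThrottle() {
            return waitingActions >= throttle;
        }
    }

    /**
     * Point 2: double the materialization window for hourly or coarser jobs.
     * Assumption: "hourly/daily" is approximated as frequency >= 60 minutes.
     */
    static int effectiveWindowSeconds(int baseWindowSeconds, int frequencyMinutes) {
        return frequencyMinutes >= 60 ? 2 * baseWindowSeconds : baseWindowSeconds;
    }

    /**
     * Point 1: choose up to `limit` jobs from a list already sorted in LRU order.
     * The first pass looks only at the first `limit` jobs and skips throttled ones;
     * the second pass keeps walking the list to fill the slots left empty.
     */
    static List<CoordJob> selectForMaterialization(List<CoordJob> lruOrdered, int limit) {
        List<CoordJob> selected = new ArrayList<>();
        int firstPassEnd = Math.max(0, Math.min(limit, lruOrdered.size()));

        // First pass: the jobs a plain "load `limit` jobs" query would return.
        for (CoordJob job : lruOrdered.subList(0, firstPassEnd)) {
            if (!job.atThrottle()) {
                selected.add(job);
            }
        }

        // Second pass: load further jobs until the limit is actually reached.
        for (int i = firstPassEnd; i < lruOrdered.size() && selected.size() < limit; i++) {
            CoordJob job = lruOrdered.get(i);
            if (!job.atThrottle()) {
                selected.add(job);
            }
        }
        return selected;
    }
}
```

With a limit of 2 and the first job in LRU order already at its throttle, the first pass alone would hand over a single job; the second pass picks up the next eligible job so that the limit is actually used.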

      Attachments

        1. OOZIE-1527-V3.patch (2 kB, Purshotam Shah)
        2. OOZIE-1527-V2.patch (36 kB, Purshotam Shah)

            People

              Assignee: Purshotam Shah (puru)
              Reporter: Mona Chitnis (chitnis)
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 96h
                  Remaining: 96h
                  Logged: Not Specified