Oozie / OOZIE-1244

SLA Support in Oozie


    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: monitoring
    • Labels: None

      Description

      We would like Oozie to support the following features:

      • JMS notifications on SLA met, SLA start miss, SLA end miss and SLA duration miss
      • Email alerting for SLA start miss, SLA end miss and SLA duration miss
      • API to query SLA met/miss information. Currently, only SLA registration events and job status events can be queried, so the actual misses have to be calculated from those.
      • A simple dashboard to view and query the SLA met/miss information built on the API mentioned above.
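
As a rough illustration of the calculation the API bullet alludes to, the sketch below derives start, end, and duration misses from an expected schedule and a job's actual times. Everything named here (`SlaExpectation`, `sla_events`, the event labels, the sample timestamps) is hypothetical and is not Oozie's API or schema; it also assumes the job has already finished, so both actual times are known.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class SlaExpectation:
    # Hypothetical container; field names are illustrative, not Oozie's schema.
    expected_start: datetime
    expected_end: datetime
    expected_duration: timedelta


def sla_events(sla: SlaExpectation,
               actual_start: datetime,
               actual_end: datetime) -> List[str]:
    """Return one MET/MISS label per SLA dimension for a finished job."""
    events = []
    # Start: the job should have started no later than the expected start.
    events.append("START_MET" if actual_start <= sla.expected_start else "START_MISS")
    # End: the job should have ended no later than the expected end.
    events.append("END_MET" if actual_end <= sla.expected_end else "END_MISS")
    # Duration: the run time should not exceed the expected duration.
    actual_duration = actual_end - actual_start
    events.append("DURATION_MET" if actual_duration <= sla.expected_duration
                  else "DURATION_MISS")
    return events


# Illustrative values only: started 5 minutes late, ran 45 minutes, ended on time.
sla = SlaExpectation(
    expected_start=datetime(2013, 3, 1, 0, 0),
    expected_end=datetime(2013, 3, 1, 1, 0),
    expected_duration=timedelta(minutes=30),
)
result = sla_events(sla, datetime(2013, 3, 1, 0, 5), datetime(2013, 3, 1, 0, 50))
print(result)  # ['START_MISS', 'END_MET', 'DURATION_MISS']
```

A query API of the kind requested would let clients read such met/miss outcomes directly instead of recomputing them from registration and job status events.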

        Attachments

        1. OOZIE-1244.patch (341 kB, Mona Chitnis)
        2. OOZIE-1244.patch (349 kB, Mona Chitnis)
        3. OozieMonitoring-929-1244.pptx (187 kB, Rohini Palaniswamy)

        Issue Links

        1. SLA Documentation (Sub-task, Closed, Mona Chitnis; Original Estimate 24h, Remaining Estimate 24h)
        2. SLA Bootstrap Service (Sub-task, Resolved, Ryota Egashira)
        3. Extend SLA tag support to Coordinator Job and Bundle Job (Sub-task, Resolved, Unassigned)
        4. Generate SLA end_miss event only after confirming against persistent store (Sub-task, Resolved, Mona Chitnis)
        5. Improve memory footprint of Calculator Memory object (Sub-task, Open, Unassigned)
        6. REST API to fetch SLA (Sub-task, Closed, Rohini Palaniswamy)
        7. Fix flakey SLA tests (Sub-task, Closed, Mona Chitnis)
        8. UI for SLA (Sub-task, Closed, Virag Kothari)
        9. Coordinator job change command not removing SLA Registration bean (Sub-task, Closed, Mona Chitnis)
        10. Fix bugs around ActionKillX not setting end time, V2SLAServlet and exception handling for event threads (Sub-task, Closed, Mona Chitnis; Original Estimate 2h, Remaining Estimate 2h)
        11. fix bug in SLARegistrationBean and CoordActionsCountForJobIdJPAExecutor (Sub-task, Closed, Ryota Egashira)
        12. Implement SLA Bootstrap Service and fix bugs in SLACalculator (Sub-task, Closed, Virag Kothari)
        13. Handle reruns for SLA notifications (Sub-task, Closed, Virag Kothari)
        14. Improve SLA reliability on restart, fix bugs related to SLA and event generation (Sub-task, Closed, Virag Kothari)
        15. Fix bugs in SLA UI (Sub-task, Closed, Rohini Palaniswamy)
        16. Fix bugs in SLA UI (Sub-task, Closed, Rohini Palaniswamy)
        17. Fix bugs related to coordchange and parentId in events (Sub-task, Closed, Virag Kothari)
        18. Remove SLACalculatorBean and add columns to SummaryBean indicating events processed and sla processed (Sub-task, Closed, Mona Chitnis)
        19. Make <should-end> optional in SLA for users only concerned about <duration> (Sub-task, Open, Unassigned)
        20. Purge rogue and stale entries from history set (Sub-task, Open, Unassigned)
        21. SLACalcStatus not updating the last modified time correctly and duplicate DURATION_* event (Sub-task, Resolved, Virag Kothari)
        22. Duplicate Coord_Action events on Waiting -> Timeout, and Coord Materialize not removing actions on Failure (Sub-task, Closed, Mona Chitnis)
        23. Confirm against database before generating start and duration miss events (Sub-task, Resolved, Rohini Palaniswamy)
        24. Duplicate end_miss events introduced by OOZIE-1472 (Sub-task, Closed, Rohini Palaniswamy)

            People

            • Assignee: Ryota Egashira (egashira)
            • Reporter: Rohini Palaniswamy (rohini)

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                • Estimated: 26h
                • Remaining: 26h
                • Logged: Not Specified
