Oozie / OOZIE-1244

SLA Support in Oozie

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: monitoring
    • Labels: None

    Description

      We would like Oozie to support the following features:

      • JMS notifications when an SLA is met and on SLA start miss, SLA end miss, and SLA duration miss
      • Email alerts on SLA start miss, SLA end miss, and SLA duration miss
      • An API to query SLA met/miss information. Currently the only SLA information that can be queried is the SLA registration event and job status events, so users have to calculate the actual misses from those themselves.
      • A simple dashboard, built on the API mentioned above, to view and query SLA met/miss information.
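
To make the met/miss categories above concrete, here is a minimal sketch of the calculation a client currently has to do by hand from a registration event and job status events. It is not Oozie's implementation: the `SlaExpectation` class, the `evaluate_sla` function, and the event name strings are hypothetical names chosen for illustration, and the rules (compare actual start, end, and elapsed time against the registered expectations) are an assumption about what "start miss", "end miss", and "duration miss" mean.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SlaExpectation:
    """Hypothetical SLA registration: what the job is expected to achieve."""
    expected_start: datetime
    expected_end: datetime
    expected_duration: timedelta


def evaluate_sla(
    expectation: SlaExpectation,
    actual_start: Optional[datetime],
    actual_end: Optional[datetime],
    now: datetime,
) -> List[str]:
    """Return the SLA events that can be decided at time `now`.

    Event names are illustrative only, not taken from Oozie.
    """
    events: List[str] = []

    # Start: met if the job started on time, missed if it started late
    # or has not started although the expected start has passed.
    if actual_start is not None:
        on_time = actual_start <= expectation.expected_start
        events.append("START_MET" if on_time else "START_MISS")
    elif now > expectation.expected_start:
        events.append("START_MISS")

    # End: same rule applied to the expected end time.
    if actual_end is not None:
        on_time = actual_end <= expectation.expected_end
        events.append("END_MET" if on_time else "END_MISS")
    elif now > expectation.expected_end:
        events.append("END_MISS")

    # Duration: a running job can already miss; "met" needs a finished job.
    if actual_start is not None:
        elapsed = (actual_end if actual_end is not None else now) - actual_start
        if elapsed > expectation.expected_duration:
            events.append("DURATION_MISS")
        elif actual_end is not None:
            events.append("DURATION_MET")

    return events


# Usage: a job that started 5 minutes late but finished within its limits.
expectation = SlaExpectation(
    expected_start=datetime(2013, 1, 1, 10, 0),
    expected_end=datetime(2013, 1, 1, 11, 0),
    expected_duration=timedelta(minutes=30),
)
print(evaluate_sla(
    expectation,
    actual_start=datetime(2013, 1, 1, 10, 5),
    actual_end=datetime(2013, 1, 1, 10, 30),
    now=datetime(2013, 1, 1, 12, 0),
))
```

A server-side API that returns these results directly would spare every client from reimplementing this comparison, which is the motivation behind the third item above.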

      Attachments

        1. OozieMonitoring-929-1244.pptx
          187 kB
          Rohini Palaniswamy
        2. OOZIE-1244.patch
          349 kB
          Mona Chitnis
        3. OOZIE-1244.patch
          341 kB
          Mona Chitnis

        Issue Links

          1. SLA Documentation (Sub-task, Closed, Mona Chitnis; 0% progress, original estimate 24h, remaining estimate 24h)
          2. SLA Bootstrap Service (Sub-task, Resolved, Ryota Egashira)
          3. Extend SLA tag support to Coordinator Job and Bundle Job (Sub-task, Resolved, Unassigned)
          4. Generate SLA end_miss event only after confirming against persistent store (Sub-task, Resolved, Mona Chitnis)
          5. Improve memory footprint of Calculator Memory object (Sub-task, Open, Unassigned)
          6. REST API to fetch SLA (Sub-task, Closed, Rohini Palaniswamy)
          7. Fix flakey SLA tests (Sub-task, Closed, Mona Chitnis)
          8. UI for SLA (Sub-task, Closed, Virag Kothari)
          9. Coordinator job change command not removing SLA Registration bean (Sub-task, Closed, Mona Chitnis)
          10. Fix bugs around ActionKillX not setting end time, V2SLAServlet and exception handling for event threads (Sub-task, Closed, Mona Chitnis; 0% progress, original estimate 2h, remaining estimate 2h)
          11. fix bug in SLARegistrationBean and CoordActionsCountForJobIdJPAExecutor (Sub-task, Closed, Ryota Egashira)
          12. Implement SLA Bootstrap Service and fix bugs in SLACalculator (Sub-task, Closed, Virag Kothari)
          13. Handle reruns for SLA notifications (Sub-task, Closed, Virag Kothari)
          14. Improve SLA reliability on restart, fix bugs related to SLA and event generation (Sub-task, Closed, Virag Kothari)
          15. Fix bugs in SLA UI (Sub-task, Closed, Rohini Palaniswamy)
          16. Fix bugs in SLA UI (Sub-task, Closed, Rohini Palaniswamy)
          17. Fix bugs related to coordchange and parentId in events (Sub-task, Closed, Virag Kothari)
          18. Remove SLACalculatorBean and add columns to SummaryBean indicating events processed and sla processed (Sub-task, Closed, Mona Chitnis)
          19. Make <should-end> optional in SLA for users only concerned about <duration> (Sub-task, Open, Unassigned)
          20. Purge rogue and stale entries from history set (Sub-task, Open, Unassigned)
          21. SLACalcStatus not updating the last modified time correctly and duplicate DURATION_* event (Sub-task, Resolved, Virag Kothari)
          22. Duplicate Coord_Action events on Waiting -> Timeout, and Coord Materialize not removing actions on Failure (Sub-task, Closed, Mona Chitnis)
          23. Confirm against database before generating start and duration miss events (Sub-task, Resolved, Rohini Palaniswamy)
          24. Duplicate end_miss events introduced by OOZIE-1472 (Sub-task, Closed, Rohini Palaniswamy)

            People

              Assignee: Ryota Egashira (egashira)
              Reporter: Rohini Palaniswamy (rohini)
              Votes: 0
              Watchers: 5

                Time Tracking

                  Original Estimate: 26h
                  Remaining Estimate: 26h
                  Time Spent: Not Specified