Hadoop Map/Reduce / MAPREDUCE-208

Provide an admin page displaying events in the cluster along with cluster status/health

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix

    Description

      Here are a few things that would help admins understand what's happening in the cluster:

      1. Event updates:
        1. recently added trackers
        2. lost trackers
        3. recently submitted jobs
        4. user updates
        5. killed/failed attempts/tasks
        6. killed jobs and the reason
        7. recent exceptions such as out-of-memory (OOM) errors
        8. expired tasks
        9. recovery manager updates
        10. memory/cpu usage
        11. blacklisting of trackers
        12. killing of maps based on fetch failures
        13. info about why a job was rejected (ACLs, max tasks), failed (failures), or killed (user)
        14. etc
      2. Status:
        1. tracker health and status
        2. User status
          1. num jobs submitted
          2. total time the cluster was used
          3. success/failed/killed history
        3. job status
          1. task completion events
          2. recently scheduled tasks
          3. progress
          4. killed/failed/success history
        4. disk space on the machine where the JobTracker (JT) is running
        5. etc
      3. Config:
        1. slot info
        2. ACL info
        3. etc

      Graphical views and automatic updates would be cool. Raising alarms on certain events would be super cool.
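
      To make the request concrete, here is a rough sketch of what the "recent events" feed and the alarm hook could look like. It is only an illustration: ClusterEventLog, Kind, Event and onAlarm are hypothetical names invented for this sketch, not existing Hadoop classes, and the event kinds cover only a few items from the list above. It keeps the last N events in memory for the admin page and calls registered handlers when an event of a given kind is recorded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch only; none of these names exist in Hadoop. */
class ClusterEventLog {

    /** A small subset of the event types listed in the description. */
    enum Kind { TRACKER_ADDED, TRACKER_LOST, TRACKER_BLACKLISTED,
                JOB_SUBMITTED, JOB_KILLED, TASK_FAILED, EXCEPTION }

    /** One entry shown on the admin page. */
    static final class Event {
        final Kind kind;
        final String source;        // e.g. a tracker name or job id
        final String detail;        // human-readable reason
        final long timestampMillis;

        Event(Kind kind, String source, String detail, long timestampMillis) {
            this.kind = kind;
            this.source = source;
            this.detail = detail;
            this.timestampMillis = timestampMillis;
        }
    }

    private final int capacity;
    private final Deque<Event> recent = new ArrayDeque<>();
    private final Map<Kind, List<Consumer<Event>>> alarms = new EnumMap<>(Kind.class);

    ClusterEventLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Registers a handler that is called whenever an event of this kind is recorded. */
    synchronized void onAlarm(Kind kind, Consumer<Event> handler) {
        alarms.computeIfAbsent(kind, k -> new ArrayList<>()).add(handler);
    }

    /** Stores the event, evicting the oldest one when full, then fires matching alarms. */
    synchronized void record(Event event) {
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(event);
        for (Consumer<Event> handler
                : alarms.getOrDefault(event.kind, Collections.emptyList())) {
            handler.accept(event);
        }
    }

    /** Returns a snapshot of the retained events, oldest first. */
    synchronized List<Event> recent() {
        return new ArrayList<>(recent);
    }
}
```

      An admin page could render recent() as the event feed, and an alarm could be wired up with something like log.onAlarm(Kind.TRACKER_LOST, e -> notifyAdmin(e)), where notifyAdmin is again a placeholder. A bounded in-memory buffer is just one possible choice; persisting events would be a separate decision.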

            People

              Assignee: Unassigned
              Reporter: Amar Kamat (amar_kamat)
              Votes: 0
              Watchers: 5
