  Hadoop YARN / YARN-321

[Umbrella] Generic application history service


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed

    Description

      The MapReduce job history server currently needs to be deployed as a trusted server in sync with the MapReduce runtime. Every new application type would need a similar application history server. Having to deploy O(T*V) trusted servers (where T is the number of application types and V is the number of versions per application) is clearly not scalable.

      Job history storage handling itself is pretty generic: move the logs and history data into a particular directory for later serving. Job history data is already stored as JSON (or binary Avro). I propose that we create only one trusted application history server, which can also provide a generic UI (displaying JSON as a tree of strings). Specific applications/versions can deploy untrusted webapps (a la AMs) to query the application history server and interpret the JSON for their specific UI and/or analytics.
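
      To illustrate the "display JSON as a tree of strings" idea, here is a minimal sketch, not part of any attached patch. It assumes the history JSON has already been parsed into plain Java Maps, Lists, and scalars; the class name JsonTreeDemo and the sample fields (jobId, status, tasks) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: renders already-parsed JSON as an indented tree of strings. */
public class JsonTreeDemo {

    /** Renders a parsed JSON value (Map, List, or scalar) as one line per node. */
    public static List<String> render(Object value) {
        List<String> lines = new ArrayList<>();
        walk("root", value, 0, lines);
        return lines;
    }

    private static void walk(String label, Object value, int depth, List<String> out) {
        // Two spaces of indentation per nesting level.
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (value instanceof Map) {
            // JSON object: one child node per key.
            out.add(indent + label);
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                walk(String.valueOf(e.getKey()), e.getValue(), depth + 1, out);
            }
        } else if (value instanceof List) {
            // JSON array: one child node per index.
            out.add(indent + label);
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                walk("[" + i + "]", items.get(i), depth + 1, out);
            }
        } else {
            // Scalar (string, number, boolean, null): shown as "label: value".
            out.add(indent + label + ": " + value);
        }
    }

    public static void main(String[] args) {
        // Made-up sample record; real history data would come from the store.
        Map<String, Object> task = new LinkedHashMap<>();
        task.put("type", "MAP");
        Map<String, Object> job = new LinkedHashMap<>();
        job.put("jobId", "job_0001");
        job.put("status", "SUCCEEDED");
        job.put("tasks", Collections.singletonList(task));
        for (String line : render(job)) {
            System.out.println(line);
        }
    }
}
```

      Because the renderer only walks maps, lists, and scalars, it needs no knowledge of any application type or version, which is the property the generic UI relies on; application-specific interpretation would stay in the untrusted webapps.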

      Attachments

        1. HistoryStorageDemo.java
          3 kB
          Zhijie Shen
        2. AHS Diagram.pdf
          287 kB
          Zhijie Shen
        3. ApplicationHistoryServiceHighLevel.pdf
          55 kB
          Sandy Ryza
        4. Generic Application History - Design-20131219.pdf
          103 kB
          Vinod Kumar Vavilapalli

        Issue Links

          Each entry: Title | Type | Status | Assignee

          1. HistoryStorage Reader Interface for Application History Server | Sub-task | Closed | Mayank Bansal
          2. Bootstrap ApplicationHistoryService module | Sub-task | Closed | Vinod Kumar Vavilapalli
          3. HistoryStorage writer interface for Application History Server | Sub-task | Closed | Zhijie Shen
          4. YARN-321 branch is broken due to applicationhistoryserver module's pom.xml | Sub-task | Closed | Zhijie Shen
          5. Adding HDFS implementation for History Reader Interface | Sub-task | Resolved | Mayank Bansal
          6. Defining the history data classes for the implementation of the reading/writing interface | Sub-task | Closed | Zhijie Shen
          7. [YARN-321] Enable ResourceManager to write history data | Sub-task | Closed | Zhijie Shen
          8. [YARN-321] History Service should create the webUI and wire it to HistoryStorage | Sub-task | Closed | Zhijie Shen
          9. [YARN-321] Implementation of ApplicationHistoryProtocol | Sub-task | Closed | Mayank Bansal
          10. [YARN-321] Add a testable in-memory HistoryStorage | Sub-task | Closed | Zhijie Shen
          11. Update application_history_service.proto | Sub-task | Closed | Zhijie Shen
          12. [YARN-321] Command Line Interface(CLI) for Reading Application History Storage Data | Sub-task | Closed | Mayank Bansal
          13. RMContainer should collect more useful information to be recorded in Application-History | Sub-task | Closed | Zhijie Shen
          14. Add a file-system implementation for history-storage | Sub-task | Closed | Zhijie Shen
          15. [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation | Sub-task | Closed | Mayank Bansal
          16. [YARN-321] Add more APIs related to ApplicationAttempt and Container in ApplicationHistoryProtocol | Sub-task | Closed | Mayank Bansal
          17. [YARN-321] Move classes from applicationhistoryservice.records.pb.impl package to applicationhistoryservice.records.impl.pb | Sub-task | Closed | Devaraj Kavali
          18. Adding ApplicationHistoryManager responsible for exposing reports to all clients | Sub-task | Closed | Mayank Bansal
          19. Optimizing the reading/writing operations of FileSystemHistoryStorage | Sub-task | Resolved | Zhijie Shen
          20. [YARN-321] Enhance History Reader interface for Containers | Sub-task | Closed | Mayank Bansal
          21. [YARN-321] Webservices REST API's support for Application History | Sub-task | Closed | Zhijie Shen
          22. Adding AHS as service of RM | Sub-task | Resolved | Zhijie Shen
          23. Improve toString implementation for PBImpls for AHS | Sub-task | Resolved | Zhijie Shen
          24. [YARN-321] Adding ContainerReport and Protobuf implementation | Sub-task | Closed | Mayank Bansal
          25. [YARN-321] Update artifact versions for application history service | Sub-task | Closed | Mayank Bansal
          26. Script changes to start AHS as an individual process | Sub-task | Closed | Mayank Bansal
          27. Generic history service should support application-acls | Sub-task | Closed | Zhijie Shen
          28. Implement PB service and client wrappers for ApplicationHistoryProtocol | Sub-task | Closed | Mayank Bansal
          29. Add AHSDelegationTokenSecretManager for ApplicationHistoryProtocol | Sub-task | Resolved | Zhijie Shen
          30. AHS History Store Cache Implementation | Sub-task | Resolved | Mayank Bansal
          31. Separate ApplicationAttemptStartDataProto and ApplicationAttemptRegisteredDataProto | Sub-task | Resolved | Zhijie Shen
          32. Removing FINAL_SAVING from YarnApplicationAttemptState | Sub-task | Closed | Zhijie Shen
          33. Revisit the output type of the reader interface | Sub-task | Resolved | Zhijie Shen
          34. Limit the number of outstanding tfile writers in FileSystemApplicationHistoryStore | Sub-task | Resolved | Zhijie Shen
          35. [YARN-321] AHS protocols need to be in yarn proto package name after YARN-1170 | Sub-task | Closed | Vinod Kumar Vavilapalli
          36. ApplicationClientProtocol and ApplicationHistoryProtocol should expose analogous APIs | Sub-task | Closed | Mayank Bansal
          37. [YARN-321] AHS WebUI should server aggregated logs as well | Sub-task | Resolved | Mayank Bansal
          38. RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state | Sub-task | Closed | Zhijie Shen
          39. Make aggregated logs of completed containers available via REST API | Sub-task | Resolved | Robert Kanter
          40. RM Tracking Links for purged applications needs a long-term solution | Sub-task | Resolved | Mayank Bansal
          41. [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.* | Sub-task | Closed | Vinod Kumar Vavilapalli
          42. [YARN-321] Merge Patch for YARN-321 | Sub-task | Closed | Vinod Kumar Vavilapalli
          43. YARN-321 branch needs to be updated after YARN-888 pom changes | Sub-task | Closed | Vinod Kumar Vavilapalli
          44. Test failures on YARN-321 branch | Sub-task | Closed | Vinod Kumar Vavilapalli
          45. Javadoc failures on YARN-321 branch | Sub-task | Closed | Vinod Kumar Vavilapalli
          46. FindBugs warnings on YARN-321 branch | Sub-task | Closed | Vinod Kumar Vavilapalli
          47. Fix formatting issues with new module in YARN-321 branch | Sub-task | Closed | Vinod Kumar Vavilapalli
          48. TestAHSWebApp failed in YARN-321 branch | Sub-task | Closed | Shinichi Yamashita
          49. Fix how to read history file in FileSystemApplicationHistoryStore | Sub-task | Closed | Shinichi Yamashita
          50. Fix config name YARN_HISTORY_SERVICE_ENABLED | Sub-task | Closed | Akira Ajisaka
          51. mvn apache-rat:check outputs warning message in YARN-321 branch | Sub-task | Closed | Shinichi Yamashita
          52. Fix history server heap size in yarn script | Sub-task | Closed | Billie Rinaldi
          53. Bugs around log URL | Sub-task | Closed | Zhijie Shen
          54. AHS records non-launched containers | Sub-task | Resolved | Gera Shegalov
          55. Renaming applicationhistoryservice module | Sub-task | Resolved | Zhijie Shen
          56. Review AHS configs and sync them up with the timeline-service configs | Sub-task | Closed | Zhijie Shen
          57. yarn applicationattempt/container print wrong usage information | Sub-task | Closed | Zhijie Shen
          58. Yarn CLI only shows running containers for Running Applications | Sub-task | Resolved | Mayank Bansal
          59. ApplicationHistoryClientService#getApplications needs to be able to filter applications | Sub-task | Open | Unassigned
          60. YarnClient will not be redirected to the history server when RM is done | Sub-task | Resolved | Unassigned
          61. History client service needs to be more robust | Sub-task | Resolved | Zhijie Shen
          62. Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService | Sub-task | Closed | Zhijie Shen
          63. FileSystemApplicationHistoryStore blocks RM and AHS while NN is in safemode | Sub-task | Closed | Jonathan Turner Eagles
          64. Removing old application history store after we store the history data to timeline store | Sub-task | Resolved | Zhijie Shen
          65. Few fields displaying wrong values in Timeline server after RM restart | Sub-task | Resolved | Naganarasimha G R
          66. Jobs are not displaying in timeline server after RM restart | Sub-task | Resolved | Naganarasimha G R
          67. Investigating whether generic history service needs to support queue-acls | Sub-task | Resolved | Sunil G
          68. Augment HistoryStorage Reader Interface to Support Filters When Getting Applications | Sub-task | Resolved | Shinichi Yamashita
          69. FileSystemApplicationHistoryStore#HistoryFileReader#next() should check return value of dis.read() | Sub-task | Resolved | Unassigned
          70. AHSClient may be not necessary | Sub-task | Resolved | Zhijie Shen
          71. Generic history service RPC interface doesn't work when service authorization is enabled | Sub-task | Closed | Zhijie Shen
          72. "yarn application -status <appId>" throws NPE when retrieving the app from the timelineserver | Sub-task | Closed | Zhijie Shen
          73. Resource usage should be published to the timeline server as well | Sub-task | Closed | Naganarasimha G R
          74. AHSWebServices should return FORBIDDEN(403) if the request user doesn't have access to the history data | Sub-task | Closed | Zhijie Shen
          75. GHS should show N/A instead of null for the inaccessible information | Sub-task | Resolved | Zhijie Shen
          76. ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers | Sub-task | Closed | Robert Kanter
          77. Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500) | Sub-task | Closed | Mit Desai

            People

              Assignee: Unassigned
              Reporter: Luke Lu (vicaya)
              Votes: 1
              Watchers: 72
