Hadoop Map/Reduce
MAPREDUCE-3973

[Umbrella JIRA] JobHistoryServer performance improvements in YARN+MR

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: jobhistoryserver, mrv2
    • Labels: None

      Description

      A few parallel efforts are under way to improve and fix issues with the JobHistoryServer in MR over YARN. This is the umbrella ticket so that we have the complete picture.

        Activity

        Ray Chiang added a comment -

        Just a quick FYI, I've added the three scaling issues for JobHistoryServer to this umbrella JIRA rather than open a second "JHS Performance Umbrella JIRA".

        Craig Welch added a comment -

        Robert Joseph Evans I would add that I think we should also change the AM/JobHistoryServer relationship so that the AM attempts to upload its history at completion via a synchronous service call to the history server (which will write the data into the db), with a possible fallback to file upload for timeout/server-down cases. The history server should no longer look at an intermediate-done directory in the line of requests, but would continue to have a background thread to monitor and load from there for these failure cases and to pick up data submitted while it was offline.
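        A minimal sketch of that flow in Java. The HistoryServerClient interface, the AMHistoryPublisher class, and every name in it are hypothetical illustrations of the idea, not actual MR classes:

            import java.io.IOException;
            import java.nio.file.Files;
            import java.nio.file.Path;

            interface HistoryServerClient {
                // Hypothetical synchronous RPC that pushes the serialized job
                // history straight into the history server's database.
                void uploadHistory(byte[] jobHistory) throws IOException;
            }

            class AMHistoryPublisher {
                private final HistoryServerClient client;
                private final Path intermediateDoneDir;

                AMHistoryPublisher(HistoryServerClient client, Path intermediateDoneDir) {
                    this.client = client;
                    this.intermediateDoneDir = intermediateDoneDir;
                }

                // Try the direct upload first; in timeout/server-down cases,
                // fall back to the file upload described above.
                void publish(String jobId, byte[] jobHistory) throws IOException {
                    try {
                        client.uploadHistory(jobHistory);
                    } catch (IOException rpcFailed) {
                        // The JHS no longer reads this directory in the line of
                        // requests; its background thread still loads from it.
                        Files.write(intermediateDoneDir.resolve(jobId + ".jhist"),
                                jobHistory);
                    }
                }
            }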

        Craig Welch added a comment -

        Robert Joseph Evans, for what it's worth I strongly second the notion that the job history server should be using a database of some sort for its storage layer: something that offers long-term storage of a significant amount of data (the full job history we want to retain), and that handles the trade-off between fast in-memory access and slower persistent storage so that the service does not have to maintain a custom version of this logic. Whether it is relational as suggested or some other datastore (some combination of embedded LevelDB + HBase, perhaps), this is in general a difficult problem which I think is better addressed by leveraging existing solutions (e.g. databases, as you suggest) than by trying to continue to support the custom model we have. By moving this state into a datastore we could also support multiple job history servers for scaling with little to no effort, as you suggest, which would be a major win.

        TL;DR +1

        Robert Joseph Evans added a comment -

        I have been thinking about how to speed up the job history server, and also how we are going to be able to cache all of the jobs that run on some of our 4000+ node clusters, where we can run in excess of 60000 jobs a day. It seems to me that the job history server is already complex and is going to get even more complicated going forward as we try to add in yet another cache (MAPREDUCE-3966) and possibly one more on top of that (MAPREDUCE-3755). What is more, all of these caches are backed by files in HDFS. We need to keep all of these caches consistent with one another, and with the operations that are going to happen to the files on HDFS, and do it without having a single huge lock (MAPREDUCE-3972). I really think that we could remove the vast majority of this complexity by changing how we cache this data.

        I would like to propose that we abstract away how we cache the data behind a pluggable data access layer API. The default implementation of this API would store the data in an embedded version of Derby (http://db.apache.org/derby) or some other embedded SQL database that fits our licensing requirements. The pluggability would allow users to replace Derby with MySQL, Oracle, or even HBase with minimal effort. I think that this would drastically reduce the size and complexity of our code, it would speed up the code and the web services a lot, and it would open up the potential for us to move to a truly stateless history server.
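        A rough sketch of what such a pluggable layer could look like, using plain JDBC. The HistoryStorage interface and the table layout are hypothetical; the "jdbc:derby:...;create=true" URL is standard embedded Derby and assumes the Derby jar is on the classpath:

            import java.sql.Connection;
            import java.sql.DriverManager;
            import java.sql.PreparedStatement;
            import java.sql.ResultSet;
            import java.sql.SQLException;

            // Hypothetical data access layer API; a MySQL or HBase plugin
            // would simply be another implementation of this interface.
            interface HistoryStorage {
                void putJob(String jobId, byte[] serializedHistory) throws SQLException;
                byte[] getJob(String jobId) throws SQLException;
            }

            class DerbyHistoryStorage implements HistoryStorage {
                private final Connection conn;

                DerbyHistoryStorage() throws SQLException {
                    // ";create=true" creates the embedded database on first use.
                    conn = DriverManager.getConnection("jdbc:derby:jhsCache;create=true");
                    try (PreparedStatement s = conn.prepareStatement(
                            "CREATE TABLE job_history (job_id VARCHAR(64) PRIMARY KEY, history BLOB)")) {
                        s.execute();
                    } catch (SQLException tableAlreadyExists) {
                        // Table left over from a previous run; the DB is a cache.
                    }
                }

                @Override
                public void putJob(String jobId, byte[] serializedHistory) throws SQLException {
                    try (PreparedStatement s = conn.prepareStatement(
                            "INSERT INTO job_history (job_id, history) VALUES (?, ?)")) {
                        s.setString(1, jobId);
                        s.setBytes(2, serializedHistory);
                        s.executeUpdate();
                    }
                }

                @Override
                public byte[] getJob(String jobId) throws SQLException {
                    try (PreparedStatement s = conn.prepareStatement(
                            "SELECT history FROM job_history WHERE job_id = ?")) {
                        s.setString(1, jobId);
                        try (ResultSet rs = s.executeQuery()) {
                            return rs.next() ? rs.getBytes(1) : null;
                        }
                    }
                }
            }

        Treating the DB as a true cache then amounts to dropping and recreating the table at startup.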

        I would love to get something like this in for 0.23.3 instead of having to do much of the other work that has been suggested. I am not tied to this approach nor to this timeframe; I am looking for feedback on the idea before I file a JIRA for it. I know that there are a lot of potential issues here. When using a database to store the data we will need to provide a mapping between the entries in the object and database tables, or serialize objects into blobs if we do not want to query on them. We are also going to have to eventually think about how we migrate the schema of the database from one version to another if we want to keep the data in there long term. In the short term we can treat the DB as a true cache and blow it away each time the history server reboots.
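        To make the object-to-table mapping question concrete, here is one way the "serialize into blobs" option could look, with a hypothetical JobSummary as the stored object; fields we want to query on get their own columns, everything else stays opaque inside the blob:

            import java.io.ByteArrayInputStream;
            import java.io.ByteArrayOutputStream;
            import java.io.IOException;
            import java.io.ObjectInputStream;
            import java.io.ObjectOutputStream;
            import java.io.Serializable;

            // Hypothetical stored object; the fields are only illustrative.
            class JobSummary implements Serializable {
                String jobId;        // queried on: deserves its own indexed column
                String user;         // queried on: deserves its own column
                long[] taskRuntimes; // never queried on: fine to keep in the blob
            }

            final class HistoryBlobs {
                static byte[] toBlob(JobSummary summary) throws IOException {
                    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                        out.writeObject(summary);
                    }
                    return bytes.toByteArray();
                }

                static JobSummary fromBlob(byte[] blob)
                        throws IOException, ClassNotFoundException {
                    try (ObjectInputStream in =
                            new ObjectInputStream(new ByteArrayInputStream(blob))) {
                        return (JobSummary) in.readObject();
                    }
                }
            }

        Java serialization is deliberately the simplest choice for the sketch; its version-brittleness is exactly the schema migration concern raised above, which is why treating the DB as a throwaway cache sidesteps it in the short term.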

        Vinod Kumar Vavilapalli added a comment -

        Focusing on performance issues only.


          People

          • Assignee: Unassigned
          • Reporter: Vinod Kumar Vavilapalli
          • Votes: 0
          • Watchers: 9

            Dates

            • Created:
            • Updated:
